1.0 INTRODUCTION

If an analog transmission system as shown in Figure 1 is to be
utilized for television, the operations which must be performed on the
video signal are typically rather straightforward and include
premodulation filtering, modulation, post-modulation filtering, and
possibly several stages of amplification. In the modulation process
itself, the filtered video signal is used to vary (in some manner) the
amplitude, phase, or frequency of the RF carrier. For applications such
as commercial broadcasting, where bandwidth is at a premium, some form
of AM (such as vestigial sideband, VSB) is generally used, while in
other applications such as space communications, which allow a
power-bandwidth trade, FM is commonly used. Regardless of the form of
analog modulation employed, the noise (or some part of it) introduced in
the channel appears at the demodulator output.

In recent years, there has been a definite trend towards use of
digital instead of analog techniques for transmission of pictorial data,
although the trend has been much more pronounced for single-frame images
(still pictures) than for multi-frame images (television). The rapid
growth in the use of digital computers, as well as the many inherent
advantages of digital communications systems, has undoubtedly
contributed to this trend.

In digital transmission systems, the video signals are processed
in some manner which includes quantization and coding such that they are
separable from the noise introduced into the channel. The performance
of digital television systems, then, is determined by the nature of the
processing techniques (i.e., whether the video signal itself or, instead,
something related to the video signal is quantized and coded) and by the
quantization and coding schemes employed.

It should perhaps be emphasized that digital transmission systems
are inherently more complex than analog systems and thus should probably
not be considered for those applications in which analog transmission is
satisfactory in terms of the power/bandwidth requirements. Digital
techniques do offer some potential advantages, however, in certain
circumstances. For example:

a. Digital transmission of television is potentially much more
efficient than analog transmission, in terms of a power-bandwidth
tradeoff.

b. More efficient time-division multiplexing techniques can
be used if multiple channels of information must be accommodated.

c. Digital techniques are easily adaptable for privacy
transmissions.

Figure 2 shows a generalized block-diagram of a digital television 
transmission system. The source encoder is assumed to perform all the 
functions unique to a particular processing technique (signal transforma- 
tions, sampling, quantizing, and coding). The basic goal of the source 
encoder is to reduce the number of digital symbols per second required 
to represent the television pictures to within some set of acceptable 
performance criteria. The performance criteria commonly employed for 
television transmission systems include picture quality (as represented 
by the judgment(s) of one or more viewers) and picture resolution (which 
is easy to measure). The source coding goal of digital symbol rate
reduction is invariably achieved by eliminating redundant or
imperceptible parts of the signals, and care must be exercised to ensure
that picture quality and/or resolution will not be appreciably degraded
by the resultant introduction of flicker, spurious contours or patterns,
etc.

It should be noted that, although digital transmission techniques are
generally thought of as requiring an increased RF bandwidth, it is, at
least in principle, conceivable that the digital data rate can be reduced
such that the required bandwidth is actually less than for analog
transmission systems.

The purpose of this survey of television digitization/compression
techniques is to summarize the key features, advantages, and
disadvantages of several of the various source coding techniques which
are in use today or which have been proposed. Since the very nature of
any source coding scheme for television involves the process of
quantization, no such schemes are entropy-preserving in the strict sense
(i.e., some information is always lost and this loss in picture
information is manifested as quantization noise). However, given that
the initial quantization process is such that the loss in picture
information is acceptable, it is possible to do further source encoding
in an entropy-preserving manner. It is also, of course, possible to lose
additional information by further source encoding.

Figure 3 illustrates one categorization which can be made of the 
various source encoding approaches for television, in which a large class 
of techniques attempts to "track" or "follow" the video waveform exactly, 
and in which the other class attempts to merely reproduce something 
which looks nearly like the original image sequence. Many other source 
encoding categorizations can and have been made. 

As will be pointed out in the subsequent discussion, and as indi- 
cated in Figure 4, the TV source coding problem can be visualized as a 
one-, two-, or three-dimensional problem. Various combinations of the 
source coding techniques shown in Figure 4 can be used in exploiting the 
spatial and temporal redundancies present in a television signal. The 
discussion to follow will first (in the classic manner) consider source 
coding techniques which are applicable to black-and-white images. What 
can be done to code color images will be considered in a later section 
of this report. 

2.0 WAVEFORM-DIGITIZATION METHODS

2.1 Pulse Code Modulation (PCM) 

As illustrated in Figure 5, when using PCM transmission for
television, the continuous image is first sampled in the spatial domain
(within a frame) to produce an M x N array of discrete samples (called
picture elements, or pixels), which are then quantized in brightness by
using one of 2^K levels to represent the value of each sample. The total
number of bits per frame to be transmitted is then given by

    B = MNK  bits/frame

and the required rate to accommodate F frames per second is

    R_{PCM} = BF = MNKF  bits/second .

As shown in Figure 6, for U.S. commercial television systems,
N = 525, M = 512, F = 30, and (typically) K = 6 to 8. Thus, PCM
transmission of U.S. commercial television would require a transmission
rate of approximately 48 Mbps to 64 Mbps.
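
The arithmetic behind the 48 Mbps to 64 Mbps figure can be checked with a
short sketch. The function name pcm_bit_rate is introduced here for
illustration only; the parameter values are those quoted above.

```python
def pcm_bit_rate(m, n, k, f):
    """Return the PCM rate R_PCM = M*N*K*F in bits per second."""
    return m * n * k * f


# U.S. commercial television values quoted in the text.
for k in (6, 8):
    rate = pcm_bit_rate(m=512, n=525, k=k, f=30)
    print(f"K = {k}: {rate / 1e6:.1f} Mbps")
```

For K = 6 and K = 8 this gives roughly 48.4 Mbps and 64.5 Mbps, in
agreement with the approximate range stated above.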

A PCM system is capable of transmitting any picture, including those
which contain only uncorrelated samples (no redundancy). Pictures
containing only uncorrelated samples are, however, not of any particular
interest, since considerable correlation must be present between samples
in order for an image to be present. In fact, statistical analyses of a
wide range of television scenes [1] have indicated that the information
content (entropy) of a typical frame is about one bit per pixel. It
should therefore be possible, by proper source encoding, to use a
transmission rate of only one bit per pixel, with no loss of
information. An even further reduction in bit rate (by a factor of
perhaps two to five) should be achievable if temporal redundancy is
exploited.

Although only a single bit per pixel should theoretically be
required to represent an average television frame, it has been found for
PCM systems [2,3] that, when K is reduced to less than about six bits per
pixel, an image degradation known as the contouring effect results. This
effect is due to the formation of discrete rather than gradual changes in
brightness. Several techniques have been developed [4] for reducing the
contouring effect, including the addition of pre-emphasis and de-emphasis
networks, use of pseudorandom dithering, and nonuniform quantizing, but
generally it is felt that PCM is not suitable if digital rates below
about 40 Mbps and high quality/resolution are both required.

[Figure 5. PCM Processing of TV. F = frame rate (frames/second);
N = number of lines/frame (vertical resolution); M = number of
elements/line (horizontal resolution). Number of picture elements
(pixels) per frame = MN; number of pixels per second = MNF. Allowing
one sample per pixel and coding each sample as one of 2^K levels gives
the required PCM transmission rate R_{PCM} = MNKF bits/second.]

In summary, the primary disadvantages of PCM television are:

a. The bit rate required for acceptable picture quality is too
high (6 bits/sample versus an entropy of ~1 bit/pixel for typical scenes).

b. Picture quality degrades rapidly for lower sampling rates or
for a reduced number of bits per sample.

c. No attempt is made to exploit element-to-element, line-to-line,
or frame-to-frame redundancy.

d. No consideration is given to the psychovisual properties of
the eye.

2.2 Entropy Coding Techniques
2.2.1 General

Entropy coding techniques, as defined in this report, are those
techniques which take advantage of the correlation present between
adjacent picture samples (quantized or unquantized) along a scan line
(i.e., element-to-element redundancy). Such techniques, then, rely on the
more likely occurrence of some sample values than others. As a very
simple and introductory example of entropy coding, consider a binary
source which emits one of two symbols, A (possibly corresponding to no
change from the previous sample value) and B (corresponding to a change),
having probabilities

    P(A) = 0.9
    P(B) = 0.1 .                                            (2.1)

Using a codeword assignment of

    0 <=> A
    1 <=> B ,                                               (2.2)

the average output word length would be

    L_1 = 1.0 bit/symbol ,                                  (2.3)

corresponding to a transmission rate

    R_1 = S bits/second                                     (2.4)

over the channel, where S is the sampling or source rate (in symbols per
second).

By encoding pairs of source symbols (the second-order extension of
the source) and assigning the shortest codeword to the most likely pair,
some reduction in transmission rate should be achievable. Since

    P(AA) = (0.9)(0.9) = 0.81
    P(AB) = P(BA) = 0.09                                    (2.5)
    P(BB) = 0.01 ,

then the codeword assignment

    0   <=> AA
    10  <=> AB
    110 <=> BA                                              (2.6)
    111 <=> BB

results in an average output word length of

    L_2 = (1)(0.81) + (2)(0.09) + (3)(0.10) = 1.29 bits/symbol pair

which corresponds (since the rate of occurrence of each symbol pair is
S/2) to a transmission rate of

    R_2 = (S/2) L_2 = 0.65 S bits/second .                  (2.7)

Continuing with the third-order extension of the source, and using the
codeword assignment below,

    Message    P( )     Codeword
    AAA        0.729    0
    AAB        0.081    100
    ABA        0.081    101
    BAA        0.081    110
    ABB        0.009    11100
    BAB        0.009    11101
    BBA        0.009    11110
    BBB        0.001    11111

we obtain an average output word length of

    L_3 = (1)(0.729) + (3)(0.243) + (5)(0.028) = 1.598 bits/symbol trio

which corresponds to a transmission rate of

    R_3 = (S/3) L_3 = 0.53 S bits/second .                  (2.9)


The above procedure could obviously be continued for higher-order
source extensions, although it would appear that the improvement must
eventually tend toward zero, since the transmission rate must always be
non-zero. In fact, one version of the fundamental source coding theorem
states that the digital rate required to represent a source having
entropy H [bits/symbol] and rate R [symbols/sec] can be made as close to
HR [bits/second] as desired, where

    H = -\sum_{k=1}^{M} P(x_k) \log_2 P(x_k)                (2.10)

      = average information per source symbol x_k .

For the original source under consideration, with

    x_1 = A,  P(x_1) = P(A) = 0.9
    x_2 = B,  P(x_2) = P(B) = 0.1                           (2.11)
    R = S ,

then

    H = -0.9 \log_2 0.9 - 0.1 \log_2 0.1 = 0.47 bits/symbol .   (2.12)

Thus, the minimum possible digital rate to represent this source is

    R_{min} = 0.47 S bits/second                            (2.13)

which is nearly achieved by the simple third-order source extension.
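
The rates R_1, R_2, R_3 and the entropy bound of this example can be
reproduced with the following sketch. It assumes statistically
independent source symbols, as the example does; the helper name
average_rate and the dictionary layout are illustrative choices only.

```python
from math import log2

P = {"A": 0.9, "B": 0.1}  # symbol probabilities of the example source


def average_rate(codebook):
    """Channel bits per source symbol for a block code on the A/B source."""
    total = 0.0
    for message, word in codebook.items():
        prob = 1.0
        for symbol in message:  # independent symbols: multiply probabilities
            prob *= P[symbol]
        total += prob * len(word)
    block = len(next(iter(codebook)))  # source symbols per message
    return total / block


first = {"A": "0", "B": "1"}
second = {"AA": "0", "AB": "10", "BA": "110", "BB": "111"}
third = {"AAA": "0", "AAB": "100", "ABA": "101", "BAA": "110",
         "ABB": "11100", "BAB": "11101", "BBA": "11110", "BBB": "11111"}

entropy = -sum(p * log2(p) for p in P.values())

for name, book in (("R1", first), ("R2", second), ("R3", third)):
    print(f"{name} = {average_rate(book):.3f} S bits/second")
print(f"H  = {entropy:.3f} bits/symbol")
```

The computed rates fall from 1.0 S through about 0.65 S to about 0.53 S,
approaching the entropy bound of about 0.47 S.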

As illustrated by the preceding example, entropy coding basically
assigns shorter word lengths to those source symbols (or groups of
symbols) which occur more frequently and longer word lengths to those
which occur less frequently. Subsequent sections of this report will more
formally present basic entropy coding concepts and describe several
methods for systematically making efficient code word assignments.

2.2.2 Basic Entropy Coding Concepts

Consider a source which provides symbols from a source alphabet

    {x_1, x_2, ..., x_n}

with a corresponding set of symbol probabilities

    {P(x_1), P(x_2), ..., P(x_n)} .

As illustrated by the example of section 2.2.1, an entropy code
essentially takes sequences of the source symbols and forms multi-symbol
messages

    {m_1, m_2, ..., m_N}

with a corresponding set of message probabilities

    {P(m_1), P(m_2), ..., P(m_N)} .

If

    n_1 = Number of symbols in message m_1
    n_2 = Number of symbols in message m_2
      ...
    n_N = Number of symbols in message m_N ,

then we can define an average message length:

    L = \sum_{k=1}^{N} P(m_k) n_k .                         (2.14)

The goal of entropy coding is to find a set of messages m_k such that L
is minimized. To obtain a lower bound on L, we first define

    H(X) = -\sum_{i=1}^{n} P(x_i) \log_2 P(x_i)             (2.15)

as the source entropy (in bits/source symbol) and

    H(M) = -\sum_{k=1}^{N} P(m_k) \log_2 P(m_k)             (2.16)

as the message entropy (in bits/message). Noting that H(X) is maximized
when

    P(x_i) = \frac{1}{n}  for all i                         (2.17)

and that

    H(X)_{max} = \log_2 n ,                                 (2.18)

then it can be seen that

    H(X) = \frac{H(M) [bits/message]}{L [symbols/message]} \le \log_2 n .   (2.19)

We see that the minimum possible value for L is given by

    L_{min} = \frac{H(M)}{\log_2 n} .                       (2.20)

When L = L_{min}, the redundancy is zero and the efficiency is 1 (100%),
where

    \eta = Encoding Efficiency = \frac{L_{min}}{L} = \frac{H(M)}{L \log_2 n}   (2.21)

and

    Redundancy = 1 - efficiency
               = 1 - \frac{H(M)}{L \log_2 n}
               = \frac{L \log_2 n - H(M)}{L \log_2 n} .     (2.22)

For the encoded messages, the average probability of occurrence of the
ith source symbol is given by

    P(x_i) = \frac{\sum_{k=1}^{N} P(m_k) C_{ki}}{L}
           = \frac{\sum_{k=1}^{N} P(m_k) C_{ki}}{\sum_{k=1}^{N} P(m_k) n_k} ,   (2.23)

where C_{ki} = number of x_i's in the kth message, and

    \sum_{k=1}^{N} P(m_k) C_{ki} = average number of x_i's per message .   (2.24)

To illustrate the utility of the above source coding concepts,
consider the first-, second-, and third-order extensions of the binary
source used in the example of section 2.2.1:

• First-order extension

    x_1 = A          P(x_1) = 0.9
    x_2 = B          P(x_2) = 0.1
    m_1 <=> 0        P(m_1) = 0.9
    m_2 <=> 1        P(m_2) = 0.1

    H(X) = -0.9 \log_2 0.9 - 0.1 \log_2 0.1 = 0.47 bits/symbol
    H(M) = -0.9 \log_2 0.9 - 0.1 \log_2 0.1 = 0.47 bits/message
    L = (0.9)(1) + (0.1)(1) = 1.0 symbols/message
    H(X)_{max} = \log_2 n = 1.0 bit/symbol
    L_{min} = H(M) / \log_2 n = 0.47 / 1.0 = 0.47 symbols/message
    \eta = L_{min} / L = 0.47 / 1.0 = 0.47 = 47%
    Redundancy = 1 - \eta = 0.53 = 53%


• Second-order extension

    m_1:  00 <=> AA      P(m_1) = P(A) P(A) = 0.81
    m_2:  01 <=> AB      P(m_2) = P(A) P(B) = 0.09
    m_3:  10 <=> BA      P(m_3) = P(B) P(A) = 0.09
    m_4:  11 <=> BB      P(m_4) = P(B) P(B) = 0.01

    H(X) = 0.47 bits/symbol
    H(M) = -0.81 \log_2 0.81 - 0.09 \log_2 0.09
           - 0.09 \log_2 0.09 - 0.01 \log_2 0.01
         = 0.94 bits/message
    L = (0.81)(2) + (0.09)(2) + (0.09)(2) + (0.01)(2)
      = 2 symbols/message
    H(X)_{max} = 1 bit/symbol
    L_{min} = H(M) / \log_2 n = 0.94 symbols/message
    \eta = L_{min} / L = 0.94 / 2 = 0.47 = 47%
    Redundancy = 1 - \eta = 53%

  Note that this uncoded second-order source extension provides no
  improvement in efficiency, or equivalently, no reduction in
  redundancy. Also note that we still have P(0) = 0.9 and P(1) = 0.1.

• Second-order extension (source coding assignment using fewer
  symbols to represent more likely messages)

    m_1:  0   <=> AA     P(m_1) = 0.81
    m_2:  10  <=> AB     P(m_2) = 0.09
    m_3:  110 <=> BA     P(m_3) = 0.09
    m_4:  111 <=> BB     P(m_4) = 0.01

    H(X) = 0.47 bits/symbol
    H(M) = 0.94 bits/message
    L = (0.81)(1) + (0.09)(2) + (0.09)(3) + (0.01)(3)
      = 1.29 symbols/message
    H(X)_{max} = 1 bit/symbol
    L_{min} = 0.94 symbols/message
    \eta = L_{min} / L = 0.94 / 1.29 = 0.729 = 72.9%
    Redundancy = 1 - \eta = 0.271 = 27.1%
    P(0) = [(0.81)(1) + (0.09)(1) + (0.09)(1)] / 1.29 = 0.77
    P(1) = 1 - P(0) = 0.23

• Third-order extension (no source coding)

    m_1:  000 <=> AAA    P(m_1) = 0.729
    m_2:  001 <=> AAB    P(m_2) = 0.081
    m_3:  010 <=> ABA    P(m_3) = 0.081
    m_4:  100 <=> BAA    P(m_4) = 0.081
    m_5:  011 <=> ABB    P(m_5) = 0.009
    m_6:  101 <=> BAB    P(m_6) = 0.009
    m_7:  110 <=> BBA    P(m_7) = 0.009
    m_8:  111 <=> BBB    P(m_8) = 0.001

  It can be readily shown for this case that

    H(M) = (0.47)(3) = 1.41 bits/message
    L = 3 symbols/message
    \eta = 0.47 = 47%
    Redundancy = 1 - \eta = 53%
    P(0) = 0.9
    P(1) = 0.1


• Third-order extension (source coding assignment using fewer
  symbols to represent more likely messages)

    m_1:  0     <=> AAA    P(m_1) = 0.729
    m_2:  100   <=> AAB    P(m_2) = 0.081
    m_3:  101   <=> ABA    P(m_3) = 0.081
    m_4:  110   <=> BAA    P(m_4) = 0.081
    m_5:  11100 <=> ABB    P(m_5) = 0.009
    m_6:  11101 <=> BAB    P(m_6) = 0.009
    m_7:  11110 <=> BBA    P(m_7) = 0.009
    m_8:  11111 <=> BBB    P(m_8) = 0.001

    H(X) = 0.47 bits/symbol
    H(M) = 1.41 bits/message
    L = (0.729)(1) + (0.081)(9) + (0.009)(15) + (0.001)(5)
      = 1.598 symbols/message
    H(X)_{max} = 1 bit/symbol
    L_{min} = 1.41 symbols/message
    \eta = L_{min} / L = 1.41 / 1.598 = 0.882 = 88.2%
    Redundancy = 1 - \eta = 0.118 = 11.8%
    P(0) = [(0.729)(1) + (0.081)(4) + (0.009)(4)] / 1.598 = 0.68
    P(1) = 1 - P(0) = 0.32


It is evident from the foregoing discussion that, by forming
multi-symbol messages and assigning fewer symbols to more likely messages
than to less likely messages, significant improvements in efficiency
(corresponding to reductions in average word length L) can be obtained.
By proper code assignment, it is ultimately possible to make

    L = L_{min} = \frac{H(M)}{\log_2 n}
    \eta = 100%
    Redundancy = 0%                                         (2.25)
    P(x_i) = \frac{1}{n}  for all i .

As yet, however, we have no information which tells us how to make the
best code word assignments to the various messages. Subsequent sections
of this report consider various structured algorithms for such code word
assignments.
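
The quantities tabulated in the examples of this section can be computed
for any code word assignment with a sketch such as the one below. The
function name code_stats and the data layout are hypothetical; the
formulas are equations (2.14), (2.16), and (2.20) through (2.23). Because
the text rounds H(M) to two places before dividing, the efficiency
computed here (about 72.7%) differs slightly from the 72.9% quoted for
the coded second-order extension.

```python
from math import log2


def code_stats(codebook, n=2):
    """codebook maps message -> (probability, code word).

    Returns (L, L_min, efficiency, redundancy, symbol_probs), where
    symbol_probs gives the probability of each code symbol in the
    encoded stream, per equation (2.23).
    """
    h_m = -sum(p * log2(p) for p, _ in codebook.values())    # (2.16)
    avg_len = sum(p * len(w) for p, w in codebook.values())  # (2.14)
    l_min = h_m / log2(n)                                    # (2.20)
    efficiency = l_min / avg_len                             # (2.21)
    alphabet = sorted({s for _, w in codebook.values() for s in w})
    symbol_probs = {
        s: sum(p * w.count(s) for p, w in codebook.values()) / avg_len
        for s in alphabet
    }
    return avg_len, l_min, efficiency, 1 - efficiency, symbol_probs


second_order = {
    "AA": (0.81, "0"), "AB": (0.09, "10"),
    "BA": (0.09, "110"), "BB": (0.01, "111"),
}
L, L_min, eta, redundancy, probs = code_stats(second_order)
print(f"L = {L:.2f}, L_min = {L_min:.2f}, efficiency = {eta:.1%}")
print({s: round(p, 2) for s, p in probs.items()})
```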

2.2.3 Shannon-Fano Coding 

The Shannon-Fano source coding algorithm has the advantage of being
probably the simplest of the various entropy coding approaches. For
binary coding, given a set of messages, the algorithm may be simply stated
as follows:

a. List messages in order of decreasing probability.

b. Partition messages into the two most equiprobable subsets.
Assign a 0 to each message in one set and a 1 to each message in the
other.

c. Repeat (b) for each of the new subsets, and continue to
repeat until each subset contains only one message.

The above procedure may best be visualized in terms of an example,
such as that shown in Table 1, in which a set of five messages is encoded
using the Shannon-Fano procedure.


Table 1. Application of Shannon-Fano Coding Algorithm

    Message  Probability  First         Second        Third         Code Word
                          Partitioning  Partitioning  Partitioning  Assignment
    m_1      1/2          0   (1/2)                                 0
    m_2      1/6          1   (1/2)     10  (1/3)     100  (1/6)    100
    m_3      1/6          1             10            101  (1/6)    101
    m_4      1/12         1             11  (1/6)     110  (1/12)   110
    m_5      1/12         1             11            111  (1/12)   111

    (Values in parentheses are the subset probabilities.)


Note that, for the Shannon-Fano coding example shown above, we have

    H(M) = -(1/2) \log_2 (1/2) - (1/3) \log_2 (1/6) - (1/6) \log_2 (1/12)
         = 1.96 bits/message
    L = (1/2)(1) + (1/2)(3) = 2 symbols/message
    L_{min} = H(M) / \log_2 n = 1.96 / 1 = 1.96 symbols/message
    \eta = L_{min} / L = 1.96 / 2 = 0.98 = 98%
    Redundancy = 1 - \eta = 0.02 = 2%


Thus, the simple Shannon-Fano encoding procedure yields a code which is
very nearly optimum (100% efficient). Should it be desired to further
improve efficiency, however, this can be accomplished by repeating the
above procedure for longer messages and message sets.
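
Steps (a) through (c) of the binary Shannon-Fano procedure can be
sketched as follows. This is a minimal illustration, not a definitive
implementation: the function name shannon_fano is hypothetical, exact
fractions are used to avoid rounding, and the rule for breaking ties
between equally balanced partitions (prefer the split with the most even
message count) is an assumption chosen so that the five-message example
reproduces Table 1.

```python
from fractions import Fraction


def shannon_fano(probs):
    """Return {message: code word} for a dict of message -> probability."""
    # Step (a): list messages in order of decreasing probability.
    ordered = sorted(probs, key=lambda m: probs[m], reverse=True)
    codes = {m: "" for m in ordered}

    def split(group):
        if len(group) < 2:
            return
        total = sum(probs[m] for m in group)
        best_cut, best_key, running = 1, None, 0
        for cut in range(1, len(group)):
            running += probs[group[cut - 1]]
            # Step (b): most nearly equiprobable subsets. Ties go to the
            # cut with the most even message count (assumed rule).
            key = (abs(total - 2 * running), abs(len(group) - 2 * cut))
            if best_key is None or key < best_key:
                best_cut, best_key = cut, key
        upper, lower = group[:best_cut], group[best_cut:]
        for m in upper:
            codes[m] += "0"
        for m in lower:
            codes[m] += "1"
        # Step (c): repeat for each of the new subsets.
        split(upper)
        split(lower)

    split(ordered)
    return codes


table1 = {"m1": Fraction(1, 2), "m2": Fraction(1, 6), "m3": Fraction(1, 6),
          "m4": Fraction(1, 12), "m5": Fraction(1, 12)}
codes = shannon_fano(table1)
average_length = sum(table1[m] * len(codes[m]) for m in table1)
print(codes, average_length)
```

For the message set of Table 1 the sketch yields the code words 0, 100,
101, 110, 111 and an average length of 2 symbols/message.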

The extension of the Shannon-Fano encoding procedure for nonbinary
(n ≠ 2) sources is straightforward. For general n-ary sources, the
messages are merely partitioned into the n most equiprobable subsets,
with the first source symbol assigned to the first subset, the second
symbol assigned to the second subset, etc. This procedure is then
repeated as before.

For purposes of encoding television signals, the Shannon-Fano coding
procedure is applied to best advantage when the messages represent
differences between adjacent sample values, rather than the absolute
sample values themselves. This is, of course, because of the high degree
of correlation between adjacent samples, resulting in many difference
values at or near zero.

As an example of the possible utility of a coding technique such
as the Shannon-Fano approach, consider the results of a tabulation of
amplitude-difference statistics for a large number of conditions and for
several different pictures. Using 4-bit quantization of picture samples,
the difference statistics shown in Table 2 are observed. A practical
scheme for applying the Shannon-Fano coding algorithm is shown in Table 3,
where delta values greater than 3 in magnitude are represented by the
Shannon-Fano 4-bit code word assignment, augmented by the 4 bits
representing the absolute difference value.
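
As a rough check on the benefit of this scheme, the average code word
length implied by Tables 2 and 3 can be computed directly. The
probabilities and word lengths below are transcribed from those tables;
the variable names are illustrative, and the result is a calculation
from the tabulated values rather than a figure quoted in this report.

```python
# (probability, code word length in bits) for each difference value
table3 = {
    "0":  (0.78, 1),
    "+1": (0.08, 2),
    "-1": (0.08, 3),
    "D":  (0.02, 4 + 4),  # 4-bit prefix plus 4-bit absolute value
    "+2": (0.01, 6),
    "-2": (0.01, 6),
    "+3": (0.007, 6),
    "-3": (0.007, 7),
    "HS": (0.004, 8),
    "VS": (0.002, 8),
}

total_probability = sum(p for p, _ in table3.values())
average_bits = sum(p * bits for p, bits in table3.values())
print(f"average word length = {average_bits:.3f} bits/sample")
```

Under these assumptions the average is about 1.6 bits per sample,
compared with 4 bits per sample for the uncoded 4-bit PCM.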

    Δ        P(Δ)
    0        0.78
    +1       0.08
    -1       0.08
    D        0.02
    +2       0.01
    -2       0.01
    +3       0.007
    -3       0.007
    HS       0.004
    VS       0.002
    Total    1.000

    D  = difference value greater than +3 or less than -3
    HS = Horizontal Sync
    VS = Vertical Sync

Table 2. Television Difference Statistics for 4-Bit PCM

    Δ        P(Δ)     Code Word Assignment
    0        0.78     1
    +1       0.08     01
    -1       0.08     001
    D        0.02     0001xxxx
    +2       0.01     000011
    -2       0.01     000010
    +3       0.007    000001
    -3       0.007    0000001
    HS       0.004    00000001
    VS       0.002    00000000

Table 3. Application of (Modified) Shannon-Fano
Coding Procedure to Television Difference Signal

2.2.4 Huffman Coding

Perhaps the most conceptually important approach which falls in
the class of entropy coding is the Huffman coding procedure, which is
similar to, although somewhat more complicated than, the Shannon-Fano
procedure. The importance of Huffman coding is that it is an optimum
approach, in the sense that, for a given message set, the average word
length is guaranteed to be the minimum achievable value. The relative
complexity of Huffman coding as opposed to, say, Shannon-Fano coding
makes it probably not extremely attractive for some applications, but
it is felt to be potentially applicable to television transmission and
therefore will be described here.


a. For an optimum code, the longer code word should correspond
to a message with lower probability; thus, if for convenience the
messages are numbered in order of nonincreasing probability,

    P(m_1) \ge P(m_2) \ge P(m_3) \ge ... \ge P(m_N) ,       (2.26)

then the corresponding code word lengths are such that

    L(m_1) \le L(m_2) \le L(m_3) \le ... \le L(m_N) .       (2.27)

b. For an optimum code, it is necessary that

    L(m_{N-1}) = L(m_N) .                                   (2.28)

If we assign similar code words to m_N and m_{N-1} except for the final
symbol, our purpose is served. Any additional symbol for m_N and m_{N-1}
unnecessarily increases L. Therefore, at least two messages, m_{N-1} and
m_N, should be encoded in words of identical length. However, not more
than n such messages could have equal length. It can be shown that, for
an optimum encoding, n_0 (the number of least probable messages which
should be encoded in words of equal length) is the integer satisfying the
requirements:

    \frac{N - n_0}{n - 1} = integer ;   2 \le n_0 \le n     (2.29)

c. Each sequence of length L(m_N) - 1 symbols either must be used
as an encoded word or must have one of its prefixes used as an encoded
word.


In the following, we shall initially restrict ourselves to the
binary case (n = 2). Condition (b) requires that the two least probable
messages have the same length, and that the two encoded messages be
identical except for their last symbols. We shall select these two
messages to be the Nth and (N-1)th original messages. After such a
selection, we form a composite message out of these two messages with a
probability equal to the sum of their probabilities. The set of messages
in which the composite message replaces the aforementioned two messages
will be referred to as an auxiliary ensemble of order 1, or simply
AE1. Now, we shall apply the rules for finding optimum codes to AE1;
this will lead to AE2, AE3, and so on. The code words for each two least
probable members of any ensemble AEk are identical except for their last
symbols, which are 0 for one and 1 for the other. This iteration cycle
is continued until AEp has only two messages. A final symbol 0 is
assigned to one of the messages and 1 to the other. Now we simply trace
back our path and remember each two messages which have to differ only
in their last digits. The optimality of this procedure is a direct
consequence of the previously described optimal steps.

The example shown in Table 4 illustrates clearly the application
of the Huffman coding procedure.
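
The binary procedure just described can be sketched with a priority
queue: the two least probable entries are repeatedly merged into a
composite message (forming AE1, AE2, and so on), and each merge
contributes one distinguishing symbol to the code words of the messages
involved. The function name huffman is hypothetical, at least two
messages are assumed, and the message set used below is the third-order
extension of section 2.2.1 rather than the contents of Table 4.

```python
import heapq
from itertools import count


def huffman(probs):
    """Return {message: binary code word} for a dict of message -> probability."""
    tiebreak = count()  # keeps heap entries comparable when probabilities tie
    # Each heap entry: (probability, tiebreak, {message: partial code word}).
    heap = [(p, next(tiebreak), {m: ""}) for m, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Combine the two least probable members of the current ensemble.
        p0, _, group0 = heapq.heappop(heap)
        p1, _, group1 = heapq.heappop(heap)
        # Prepending the symbol means words read from the last merge
        # (the first symbol) down to the first merge (the final symbol).
        merged = {m: "0" + c for m, c in group0.items()}
        merged.update({m: "1" + c for m, c in group1.items()})
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]


third_order = {"AAA": 0.729, "AAB": 0.081, "ABA": 0.081, "BAA": 0.081,
               "ABB": 0.009, "BAB": 0.009, "BBA": 0.009, "BBB": 0.001}
codes = huffman(third_order)
average = sum(third_order[m] * len(codes[m]) for m in third_order)
print(codes)
print(f"average word length = {average:.3f} symbols/message")
```

For this message set the resulting word lengths are 1, 3, 3, 3, 5, 5, 5,
5, giving the same 1.598 symbols/message obtained in section 2.2.2.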

The Huffman encoding procedure described above is readily adaptable
for nonbinary code word assignments. The key to nonbinary Huffman coding
is the requirement of condition (b), as previously described. Thus, for
the ternary (n = 3) example shown in Table 5, the number of least probable
messages which must be encoded in words of equal length is that value of
n_0 which satisfies

    \frac{N - n_0}{n - 1} = \frac{6 - n_0}{3 - 1} = integer ;   2 \le n_0 \le 3   (2.30)

or

    n_0 = 2 .

After the initial selection of n_0 = 2 least probable messages, and
formulation of AE1, subsequent subsets of three least probable messages
are used to generate new auxiliary ensembles.
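
The requirement of equation (2.29) can be evaluated mechanically. The
sketch below, with the hypothetical name initial_group_size, simply
searches the allowed range for the value of n_0 that makes
(N - n_0)/(n - 1) an integer.

```python
def initial_group_size(num_messages, n):
    """Return n0 such that (N - n0) is divisible by (n - 1), 2 <= n0 <= n."""
    for n0 in range(2, n + 1):
        if (num_messages - n0) % (n - 1) == 0:
            return n0
    raise ValueError("no valid n0; need num_messages >= 2 and n >= 2")


print(initial_group_size(6, 3))  # ternary example with N = 6
```

For the ternary example (N = 6, n = 3) this returns 2, in agreement with
equation (2.30); for any binary code it returns 2.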


[Table 4. Application of Binary Huffman Coding Procedure]

[Table 5. Application of Ternary Huffman Coding Procedure]

2.3 Data Compression Techniques

2.3.1 General

One approach for reducing transmission rate requirements when the
data to be transmitted is repetitive in nature (i.e., contains
redundancy) is to employ a transformation to remove those samples which
are declared redundant. Such redundancy reduction or data compression
techniques are generally not entropy-preserving in a strictly classical
sense, because some information is lost; however, since the processes are
reversible to within a specified allowable tolerance or peak error, they
are frequently considered to be in the class of information-preserving
techniques.

Several types of redundancy reduction algorithms have been
formulated and many more are conceptually possible. In general, however,
they may be classed as either predictors or interpolators. Because of the
correlation that exists from sample to sample, a prediction of the value
of the next sample can be made. If the prediction is close enough (to
within some allowable tolerance determined by the particular algorithm
employed), then there is no need to transmit the sample. Since the same
prediction can be made at the receiver, the missing samples can be
reconstructed to within the transmitter tolerance.
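
The predictor idea can be illustrated with a minimal sketch that assumes
the simplest possible prediction, namely that the next sample equals the
last transmitted sample, with a fixed tolerance. The function names and
the (index, value) output format are illustrative assumptions and are
not taken from any particular algorithm described in this report.

```python
def zop_compress(samples, tolerance):
    """Return (index, value) pairs for the samples that must be transmitted."""
    sent = []
    last = None
    for i, s in enumerate(samples):
        # Transmit only when the prediction (the last transmitted value)
        # misses the actual sample by more than the tolerance.
        if last is None or abs(s - last) > tolerance:
            sent.append((i, s))
            last = s
    return sent


def zop_reconstruct(sent, length):
    """Rebuild the sample sequence by holding each transmitted value."""
    transmitted = dict(sent)
    out, value = [], None
    for i in range(length):
        value = transmitted.get(i, value)
        out.append(value)
    return out


samples = [10, 10, 11, 14, 14, 13, 20]
sent = zop_compress(samples, tolerance=1)
rebuilt = zop_reconstruct(sent, len(samples))
peak_error = max(abs(a - b) for a, b in zip(samples, rebuilt))
print(sent, rebuilt, peak_error)
```

In this example only three of the seven samples are transmitted, and the
reconstruction error never exceeds the tolerance.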

Polynomial data compression techniques have been widely reported 
on in the literature and actual hardware has been built to implement such 
techniques. The polynomial compression schemes generally conceded to be 
most promising for a wide range of data sources are listed below, and 
will be described in some detail in subsequent sections of this report. 

1. Zero-order Predictor (ZOP) 

2. Zero-order Interpolator (ZOI) 

3. First-order Predictor (FOP) 

4. First-order Interpolator (FOI) . 

As a general rule, the type of algorithm to use in a particular
application is highly dependent on the data characteristics of the source.
Higher-order polynomial compression algorithms (second-order predictors
and interpolators, third-order, etc.) are, of course, not attractive in
most applications because of their inherent design complexity and because
of their tendencies to oscillate when operating on data which has small
fluctuations such as those which might be caused by noise (the tendency
to oscillate when operating on noisy data is also characteristic of the
first-order predictor [5]).

2.3.2 Buffer Memory Requirements 

When a data compression algorithm is added to a system, the
relationship between the data, its channel, and the time information
changes. The data is input to the data compression algorithm from the
computer in a synchronous manner. After the reversible transfer takes
place, some of the channels of a frame will not be output. The result is
an asynchronous output from the data compression algorithm. By
asynchronous we mean that the output appears in a seemingly random
fashion and the synchronous frame is eliminated.

A major subsystem of the data compression system is the buffer.
The buffer is needed to reestablish a synchronous bit rate so that
standard systems can be used for transmission of compressed data. The
buffer, when implemented in a hardware system, must be some sort of
storage device such as a magnetic core memory. The memory will have a
finite storage capability. If the ratio of the average read-in rate to
the read-out rate approaches 1, such as during an active data burst, then
many channels will have outputs from the compression algorithm, and the
probability of buffer overflow will be large. Buffer overflow is the term
used to describe the condition of full buffer memory: any additional
inputs to the memory cannot be stored and are lost, hence overflow. The
loss of any data, especially at a time of some important event, cannot be
allowed. When it is recalled that one compressed data point represents
many uncompressed data points, the solution of the problem of buffer
overflow clearly becomes mandatory.

There are several ways of controlling buffer overflow. One method
commonly reported in the literature is adaptive aperture control. This
type of buffer control operates by sensing either buffer fullness or data
activity, or both. Depending upon the fullness or amount of data activity,
the tolerances within the data compression algorithm are increased in
order to decrease the number of significant samples output from the data
compression algorithm to the buffer. An obvious disadvantage of this
method is the increase of peak error during an active period, when the
data may be of the most interest to the experimenter. The philosophy used
to justify this method is that, because of the higher activity, the peak
errors are not as noticeable to the human eye amid steep slopes and sudden
changes, assuming human data examination.

A second method of buffer control is called adaptive sampling. With
this type of adaptive control, the buffer changes the sampling sequence
in the commutator to reduce the sampling rate of active channels if
overflow is imminent. If a randomly addressable memory were used for the
commutator, this could be a completely flexible operation, since any
sampling sequence could be set up.

Another method of buffer control is that of adaptive filtering. Here
the data is sampled near the frequency of the system noise. An average
is taken of N points, thus smoothing the data. This average is
transmitted to the data compression algorithm and from there on to the
buffer. If data activity increases, tending to fill the buffer, the
number of samples averaged (N) is increased, thus decreasing the total
number of points input to the data compression algorithm and buffer.

2.3.3 Sample Identification and Timing Methods 

Since the nonredundant samples provided by a data compression device
may occur at any time, some means of time tagging them must be provided.
This can be done in many ways; however, the simplest way is to follow
each nonredundant sample with a run-length code (that is, a code word
which tells how many samples were dropped since the last nonredundant
sample). The required length of this word is a function of the data. If,
for instance, a sample reduction of 8 or 10 is expected, a 4-bit code
could be used. Even if the sixteenth sample is redundant, it is
considered nonredundant and the process is started over, thus limiting
the maximum sample reduction to 15 to 1. Care must be used in choosing
the length of this code, since each added bit must be transmitted with
every nonredundant sample.
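
The run-length tagging rule above can be sketched as follows. This is a
minimal illustration, not the report's implementation: the function
names, the boolean "nonredundant" flags, and the (index, run) output
format are assumptions made for the example.

```python
def tag_with_run_lengths(nonredundant, code_bits=4):
    """Return (sample_index, dropped_count) for every transmitted sample.

    nonredundant[i] is True when the compression algorithm marks sample i
    as nonredundant. dropped_count is the number of samples dropped since
    the previous transmitted sample and must fit in `code_bits` bits.
    """
    max_run = 2 ** code_bits - 1          # 15 for a 4-bit code
    tagged = []
    dropped = 0
    for index, keep in enumerate(nonredundant):
        # A redundant sample is forced out once the counter is full,
        # i.e. the sixteenth sample after a transmitted one for 4 bits.
        if keep or dropped == max_run:
            tagged.append((index, dropped))
            dropped = 0
        else:
            dropped += 1
    return tagged


def recover_indices(run_lengths):
    """Receiver side: rebuild sample positions from the run-length codes."""
    indices = []
    position = -1
    for run in run_lengths:
        position += run + 1
        indices.append(position)
    return indices
```

With one nonredundant sample followed by a long quiet stretch, the
counter forces a transmission every sixteen samples, which is why the
code length caps the achievable sample reduction.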

Another possible timing scheme is to transmit the actual time when 
a nonredundant sample occurs. This requires many bits and, in general, 
is not used. Some modification may be used, however, such as transmitting 
a master time at fixed intervals and then transmitting the time since the 
last master time. This can be dangerous in case the master time should 
be lost, since all data also will be lost until another master time is 
received. 

Another, and common, method used in multi-channel systems is to 
transmit a code number at the beginning of each frame, and then tag each 
nonredundant sample with a channel number. Since a particular channel 
will always appear at a known location within the frame, time can be 
established. 

As mentioned earlier, the number of nonredundant samples does not
increase as the sampling rate is increased once enough samples are
available to define the reconstructed waveform. If, however, a run-length
timing code is used and the sampling rate is increased, the maximum run
length will be exceeded more often, and the number of nonredundant
samples will increase. When a run-length code is used together with large
tolerances, there is a good chance the run length will be exceeded.

2.3.4 Description of Polynomial Compression Algorithms 

2.3.4.1 Zero-Order Predictor (Figure 7)

This algorithm predicts that the following samples will be within
a specified tolerance of the first sample. If this prediction is true,
the data sample is considered redundant and is not transmitted. The
first sample which falls outside the tolerance level is considered
nonredundant, is transmitted, and becomes the new reference for
subsequent predictions. Tolerances will normally be different for each
channel, but will be variable by command.
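
A minimal sketch of the zero-order predictor for a single channel is
given below. The function name and the (index, value) output format are
assumptions made for illustration; transmitting the very first sample as
the initial reference is also assumed.

```python
def zero_order_predictor(samples, tolerance):
    """Return the (index, value) pairs that would be transmitted."""
    if not samples:
        return []
    transmitted = [(0, samples[0])]       # first sample is the reference
    reference = samples[0]
    for index, value in enumerate(samples[1:], start=1):
        if abs(value - reference) > tolerance:
            # Prediction failed: send the sample and make it the new
            # reference for subsequent predictions.
            transmitted.append((index, value))
            reference = value
    return transmitted
```

The receiver would hold each transmitted value until the next one
arrives, so every reconstructed sample stays within the tolerance of the
original.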

2.3.4.2 Zero-Order Interpolator (Figure 8)

In this algorithm, the tolerance is set about the first data point,
as was done in the Zero-Order Predictor. When the next point is
examined, the tolerance is set up about it. The windows of the
tolerances of the two points are checked for an overlap. If the windows
do overlap, a new window upper bound is found by taking the minimum of
the upper bound of the old window and the new point plus the
half-tolerance. A new lower bound is found by similarly taking the
maximum of the old lower bound and the new point minus the
half-tolerance. The tolerance is then set up about the third point, the
window formed and checked for overlap against the last window generated,
a new window formed, etc. When the windows fail to overlap, an average
value is found by taking the average of the upper and lower bounds. This
value is then transmitted and a new run started.
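
The window-narrowing procedure can be sketched as follows. The function
name, the (run_length, value) output format, and the flushing of the
final run at the end of the data are assumptions made for illustration.

```python
def zero_order_interpolator(samples, half_tolerance):
    """Return (run_length, transmitted_value) for each run of samples."""
    runs = []
    lower = upper = 0.0
    count = 0
    for value in samples:
        new_lower = value - half_tolerance
        new_upper = value + half_tolerance
        if count and new_lower <= upper and new_upper >= lower:
            # Windows overlap: keep only their common part.
            lower = max(lower, new_lower)
            upper = min(upper, new_upper)
            count += 1
        else:
            if count:
                # No overlap: transmit the midpoint of the last window.
                runs.append((count, (lower + upper) / 2))
            lower, upper, count = new_lower, new_upper, 1
    if count:
        runs.append((count, (lower + upper) / 2))   # flush (assumed)
    return runs
```

Because the transmitted value lies inside every window of its run, each
sample in the run is within the half-tolerance of the value sent.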

2.3.4.3 First-Order Predictor (Figure 9)

In this algorithm, the first point is transmitted. The third point
is predicted to be on the line connecting the first and second points,
plus or minus the given tolerance. If the point is in tolerance, the
next point is predicted on the same line, etc. If the new point is
outside the tolerance limits, the previous predicted sample is
transmitted and a new run is started using the next value.
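
One possible reading of the first-order predictor is sketched below. The
text leaves some details open, so the following are assumptions made for
illustration: the new line starts at the transmitted (predicted) point
and passes through the sample that broke the prediction, and the end of
the last line is flushed when the data ends.

```python
def first_order_predictor(samples, tolerance):
    """Return the (index, value) pairs that would be transmitted."""
    if len(samples) < 2:
        return list(enumerate(samples))
    transmitted = [(0, samples[0])]
    anchor_index, anchor_value = 0, samples[0]
    slope = samples[1] - samples[0]       # line through first two points
    for index in range(2, len(samples)):
        predicted = anchor_value + slope * (index - anchor_index)
        if abs(samples[index] - predicted) > tolerance:
            # Send the previous point on the line, then restart the
            # line from that point through the current sample.
            end_index = index - 1
            end_value = anchor_value + slope * (end_index - anchor_index)
            transmitted.append((end_index, end_value))
            anchor_index, anchor_value = end_index, end_value
            slope = samples[index] - anchor_value
    last = len(samples) - 1
    transmitted.append((last, anchor_value + slope * (last - anchor_index)))
    return transmitted
```

The receiver would reconstruct each run by drawing a straight line
between consecutive transmitted points.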

Figure 7. Zero-Order Predictor
(Legend: X = transmitted point, / = sampled point)

Figure 8. Zero-Order Interpolator

Figure 9. First-Order Predictor

2.3.4.4 First-Order Interpolator (Figure 10)

In this algorithm, the first point is transmitted. The given
tolerance is placed about the second point and lines are drawn from the
first point through the limits of the tolerance. If the third point is
within the "fan" thus formed, a new "fan" is started by drawing lines
between the first point and the tolerance limits placed about the third
point. The new "fan" will be the intersection* of the two "fans."
Subsequent "fans" are formed until a point does not fall within the
"fan." A mean value of the last tolerance spread is transmitted and a
new sequence is begun using the last actual point.

2.4 Predictive Coding Techniques 

2.4.1 General 

As pointed out earlier, PCM systems are inherently inefficient in
coding correlated data such as images, since the PCM encoder always
assigns the same number of binary digits to each of a number of
correlated variables. As will be described later, transform coding
systems involve an operation on these variables to uncorrelate them
prior to their quantization. Another approach to the problem of
generating a set of uncorrelated variables uses classical prediction
theory. A predictive coder, as illustrated in general form in Figure 11,
forms a prediction $\hat{S}_0$ of each data sample $S_0$ (generally by
using n previous samples $S_i$, i = 1, 2, ..., n). A differential signal

    $e_0 = S_0 - \hat{S}_0$                                       (2.31)

is then formed, and it is this differential signal that is quantized and
coded for transmission.

2.4.2 Differential PCM (DPCM) 

DPCM is a form of predictive coding in which the predictor provides
an estimate $\hat{S}_0$ of each sample $S_0$, based on the previous n
samples, by the linear operation

    $\hat{S}_0 = \sum_{i=1}^{n} A_i S_i$                          (2.32)

where the weighting coefficients $A_i$ are chosen to minimize the
variance of the differential signal for efficient quantization and
coding.

* Intersection here refers to the intersection of two sets, fan 1 and
fan 2.

Figure 10. First-Order Interpolator

Figure 11. Predictive Coding System (General): (a) Transmitter,
(b) Receiver

A DPCM encoder can form its prediction based on observations
(samples) in only one dimension (such as along a scan line), as
indicated in Figure 12. Extension of DPCM to two and three dimensions is
conceptually straightforward and will be described later in this report.

For nonadaptive DPCM, the predictor coefficients $A_i$ are fixed
constants, optimized for the expected statistics of the signal. Such a
system performs well only if the original signal is stationary and has
about the same statistics as those for which the predictor coefficients
were designed. Since signal statistics for images vary greatly from
scene to scene, adaptive techniques must generally be employed for best
performance.
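
A one-dimensional nonadaptive DPCM loop can be sketched as follows. This
is an illustrative sketch, not the system of Figure 12: the uniform
quantizer with step size `step`, the zero initial history, and the use
of previously reconstructed samples in the predictor (so that encoder
and decoder stay in step) are assumptions.

```python
def predict(history, coeffs):
    """Linear prediction from the most recent reconstructed samples."""
    recent = history[-len(coeffs):][::-1]      # newest sample first
    return sum(a * s for a, s in zip(coeffs, recent))


def dpcm_encode(samples, coeffs, step):
    """Return (codes, reconstruction) for a fixed-coefficient DPCM coder."""
    codes, recon = [], []
    for sample in samples:
        estimate = predict(recon, coeffs)
        code = round((sample - estimate) / step)   # quantized difference
        codes.append(code)
        recon.append(estimate + code * step)       # what the decoder sees
    return codes, recon


def dpcm_decode(codes, coeffs, step):
    """Rebuild the signal from the quantized differential signal."""
    recon = []
    for code in codes:
        recon.append(predict(recon, coeffs) + code * step)
    return recon
```

With a single coefficient of 1.0, the predictor reduces to "previous
sample", and the reconstruction error never exceeds half the quantizer
step in this sketch.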

2.4.3 Adaptive DPCM (ADPCM) 

2.4.3.1 DPCM Systems with Adaptive Predictors

Optimal encoding of the nonstationary differential signal requires
a variable quantizer that changes to accommodate the variations in the
differential signal. In designing an adaptive DPCM system, one can
either use a predictor with variable parameters that change with the
variations in the signal (always generating a stationary differential
signal) or use a fixed predictor with a variable quantizer to
accommodate the resultant nonstationary differential signal. In addition
to these two adaptive systems, adaptivity can be incorporated in the
system by using a variable sampling rate and fixing both the predictor
and the quantizer.

In a DPCM system with an adaptive linear predictor, the weightings
on the adjacent samples used in predicting an incoming sample can change
according to variations in the signal value. Atal and Schroeder [6]
studied the performance of such an adaptive system for voice signals.
Their proposed system included a 5 millisecond delay during which the
incoming samples were stored in an input buffer and were used to obtain
an estimate of the signal covariance matrix. The measured covariance
matrix was used to obtain a set of weightings for the predictor. These
values were then used for processing the stored signals. The updated
values of the predictor coefficients need to be transmitted to the
receiver once every 5 milliseconds. Atal and Schroeder used the variable
predictor with a two-level quantizer and reported good coding results.
Although identical systems can be implemented for coding pictorial data,
this type of system has not been reported on in the open literature.
Instead, researchers have used adaptive DPCM systems with a fixed and
simple predictor and an adaptive quantizer.

Figure 12. One-Dimensional DPCM Coding System: (a) Pixels Employed in
Predicting $S_0$, (b) Transmitter, (c) Receiver

2.4.3.2 DPCM Systems with Adaptive Quantizers

A DPCM system with a fixed predictor will have a nonstationary 
differential signal for nonstationary data. Using a fixed quantizer, 
nonstationary differential signals would cause an abnormal saturation 
or a frequent utilization of the smallest level in the quantizer. To 
remedy this situation, the threshold and the reconstruction levels of 
the quantizer must be made variable to expand and contract according to 
signal statistics. Adaptation of the quantizer to signal statistics is
accomplished using various approaches. Virupaksha and O'Neal [7]
suggested an adaptive DPCM system for speech signals that stores 25 samples
of the differential signal to obtain an estimate for the local standard 
deviation of the signal. Then the stored signal is normalized by the 
estimated standard deviation and is quantized using a fixed quantizer. 
Naturally, the scaling coefficient must be transmitted once for every 
25 samples for receiver synchronization. Ready and Spencer [8] use a 
similar approach in a system called Block-Adaptive DPCM, which they use 
for bandwidth compression of monochrome images. In Block-Adaptive DPCM 
systems, a block of M samples is stored and is normalized by n possible 
constants. The total distortion for all M samples using each normalizing 
constant is calculated at the encoder. The normalizing constant giving 
the smallest distortion is used to scale the samples in the block prior 
to their quantization and transmission. The system requires
$(\log_2 n)/M$ binary digits per sample of overhead information for
receiver synchronization.
Ready and Spencer use a two-dimensional DPCM system employing 3 adjacent 
samples in its predictor and use a block of 16 samples with four possible 
normalizing constants. They report a 36% reduction in bit rate over a 
similar nonadaptive DPCM system at about 2 bits per sample. The improve- 
ment in performance is less at higher bit rates. 
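
As a worked check of the overhead expression, using only the block size
and number of normalizing constants quoted above for Ready and Spencer
(M = 16, n = 4):

```latex
\frac{\log_2 n}{M} = \frac{\log_2 4}{16} = \frac{2}{16}
  = 0.125 \ \text{binary digits per sample}
```

At an operating point of about 2 bits per sample, this overhead is
roughly 6 percent of the transmitted rate.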

A different approach, which has not appeared in the technical
literature, is a DPCM system with a variable set of thresholds and
reconstruction levels. This is the self-synchronizing approach used in
adaptive delta modulators, where the step size contracts and expands
depending upon the polarity of sequential output levels. In a DPCM
quantizer, the set of threshold and reconstruction levels would contract
and expand depending upon the sequential utilization of inner or outer
levels of the quantizer. For instance, a variable quantizer can be
designed in which all reconstruction levels expand by a factor of P (for
some optimum value of P) upon two sequential occurrences of the
outermost level and contract by a factor of 1/P upon two sequential
occurrences of the smallest level. This system has the advantage of
being completely adaptive and does not require any overhead information,
because the receiver is self-synchronizing.
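
The expand/contract rule can be sketched as a scale tracker that both
the transmitter and the receiver run on the same sequence of quantizer
outputs, which is why no overhead information is needed. The function
name, the representation of each output by its level magnitude index
(0 = smallest level), and the choice to keep scaling on every further
repeat are assumptions made for illustration.

```python
def track_scale(level_magnitudes, outermost, p=2.0, initial_scale=1.0):
    """Return the quantizer scale in force for each output sample.

    level_magnitudes[i] is the magnitude index of the i-th quantizer
    output: 0 for the smallest level, `outermost` for the largest.
    """
    scale = initial_scale
    scales = []
    previous = None
    for level in level_magnitudes:
        scales.append(scale)              # scale used for this sample
        if previous is not None and level == previous:
            if level == outermost:
                scale *= p                # two outer levels in a row
            elif level == 0:
                scale /= p                # two smallest levels in a row
        previous = level
    return scales
```

Since the scale depends only on outputs already transmitted, a receiver
running the same function reproduces the same scale sequence.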

2.4.4 Delta Modulation (DM) 

Delta modulation is a simple type of predictive coding system and
is essentially a one-digit DPCM system. The delta modulator attempts to
transmit the quantized difference between successive samples of the
signal rather than the samples themselves. The output of a prediction
circuit is differenced with the signal; the difference is quantized and
encoded into a PCM sequence.

With linear or simple delta modulation, the quantizer contains only
two levels and the predictive circuit is an integrator whose output is
fed back to the input. The integrator output is compared with the input
signal to produce a difference signal. If, at a given periodic sampling
instant, the difference is positive, a 1 is transmitted; otherwise, a 0
is transmitted. A functional diagram of the linear delta modulator is
shown in Figure 13. The receiver decodes the received signal by using an
integrator (as in the feedback circuit of the delta modulator) to
reconstruct the signal, which may be added to the error signal. If
slightly better quality is desired, the integrator in the feedback
circuit may be replaced by a more complex circuit having a number of
integrators whose outputs are combined in such a manner as to enhance
the prediction function.
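
A linear delta modulator with a single ideal integrator can be sketched
in discrete time as follows. The function names, the zero initial
integrator state, and the fixed step size `step` are assumptions made
for illustration.

```python
def delta_modulate(samples, step):
    """Return one bit per sample: 1 if the input exceeds the estimate."""
    estimate = 0.0                        # integrator output
    bits = []
    for sample in samples:
        bit = 1 if sample - estimate > 0 else 0
        bits.append(bit)
        estimate += step if bit else -step    # feedback integrator
    return bits


def delta_demodulate(bits, step):
    """Receiver: the same integrator driven by the received bits."""
    estimate = 0.0
    output = []
    for bit in bits:
        estimate += step if bit else -step
        output.append(estimate)
    return output
```

When the input is flat, the output alternates about it by one step
(granular noise); when the input rises faster than one step per sample,
the estimate falls behind (slope overload).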

To date, the most comprehensive analysis of delta modulation
(linear and adaptive) has been carried out by Abate [9]. This
investigation was carried out assuming speech-like signals having an
exponential probability function. The investigation considered only
single integrators in the feedback loop.

An important term in studying delta modulation is the slope loading
factor, since the delta-mod predictor is attempting to follow the
changes in the voice signal. The slope capability (i.e., the ability to
follow changes in the voice signal) must be greater than the slope of
the input voice signal. The slope capability is just the product of the
step size k (the quantizer level) and the sampling rate $f_s$.
Therefore, the following condition must hold if the system is not to be
overloaded:

    $k f_s > |f'(t)|$                                             (2.33)

where $|f'(t)|$ represents the magnitude of the input signal derivative
with respect to time.
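
As a worked illustration of condition (2.33), consider a sinusoidal
input. The test signal and the numerical values below (amplitude A = 1,
signal frequency 1 kHz, sampling rate 64 kHz) are assumptions chosen
only for the example and do not come from the report.

```latex
f(t) = A \sin(2\pi f_0 t)
\quad\Rightarrow\quad
\max_t |f'(t)| = 2\pi f_0 A
\quad\Rightarrow\quad
k > \frac{2\pi f_0 A}{f_s}
  = \frac{2\pi \cdot 1000 \cdot 1}{64000} \approx 0.098
```

A step size smaller than about 0.098 units would therefore overload on
the steepest part of this sinusoid.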

The slope overload parameter is defined as

    $s = k f_s / \sqrt{D}$                                        (2.34)

where

    $D = \int_0^{\omega_m} \omega^2 F(\omega)\, d\omega$           (2.35)

and $F(\omega)$ = one-sided power spectrum of the signal
    $\omega_m$  = the maximum radian frequency to which the signal is
                  bandlimited prior to encoding.

Abate presents values for $F(\omega)$ and s to be used in his analysis;
also, s is defined in terms of the bandwidth expansion factor B, which
is the ratio of the bandwidth required of the digital channel to that of
the signal. For delta modulation systems, B is one-half the ratio of
sampling rate to signal bandwidth, or $f_s / 2 f_m$.

Other parameters which are valuable in evaluating the performance
of a delta modulator system are

    $N_g$ = granular noise, which is the variance between the
            predicted feedback pulse and the input signal
    $N_o$ = noise due to slope overload.

Both $N_g$ and $N_o$ are functions of s, with $N_g$ predominating for
large values of s and $N_o$ predominating for small values of s. The
total noise power (quantization noise) is the sum of these two noise
contributions.

In the linear DM system, the quantization noise
($N_q = N_g + N_o$) is sensitive to small changes in the mean power of
the signal, and the range of s over which $S/N_q$ is near maximum is
small. A change in signal power produces a change in s. When s is not
near an optimum value, $N_q$ will not be near its minimum value and the
delta modulator will not perform well. Thus, it is desirable to force
the delta-mod system to adaptively vary s as a function of the input
signal. The adaptive delta-mod system performs this function and will be
discussed in the next section.

2.4.5 Adaptive Delta Modulation (ADM) 

In the previous section, the need for an adaptive delta modulator
was motivated by considering the signal-to-noise ratio as a function of
slope overload factor. The need for an adaptive delta modulator and the
motivation for varying the step size can also be demonstrated by
studying equation (2.33).

The relationship between $f_s$ and $|f'(t)|$ in equation (2.33) is
seen to be proportional if k is held constant. Thus, if the derivative
of the signal is large, then $f_s$ must be large. Yet $|f'(t)|$ varies
as a function of time so that, to avoid slope overload, $f_s$ would have
to be selected for the condition where $|f'(t)|$ is maximum. Selecting
$f_s$ in this manner is not efficient, since $|f'(t)|$ will have smaller
values much of the time. The sampling rate $f_s$ must be constant for a
digital communication system, but k can vary.

The objective of the adaptive delta modulation system is to
maintain optimal slope loading and maximum $S/N_q$ by controlling the
value of the slope loading factor through variation of the step size k.
The distinction between different adaptive delta modulation systems is
based on how k is varied. Those to be discussed below are the Abate [9]
and the High Information Delta Modulation (HIDM) [10] algorithms.

Before proceeding to the discussion of particular algorithms, it
is necessary to discuss concepts pertaining to adaptive delta modulation
systems in general. Essentially, there can be both discrete and
continuous methods of varying the step size. The former observes the
binary pulse sequence at the quantizer output and changes the step size
accordingly and is called "discrete adaptive delta modulation." This
system is shown in Figure 14. The latter, called "continuous adaptive
delta modulation," is considered to be applicable primarily to voice
communications and will not be discussed in this report.

In the discrete adaptive delta modulation system, the gains $K_i$
are chosen as a function of the observation of the sequence of output
pulses leaving the two-level quantizer (Figure 14). These gains multiply
the step size so that the maximum slope capability of the system remains
greater than $|f'(t)|$ [see equation (2.33)]. Slope overload is no
longer the controlling degradation, since the system is able to increase
its step size at the sampling rate until the derivative of the signal is
greater than the maximum slope capability of the system, so that

    $|f'(t)| > K_n k f_s$                                         (2.36)

where $K_n$ is the maximum gain. Given this situation, a normalized
slope loading factor can be defined:

s‘ = • ^2.37) 

With this parameter, many results derived for linear delta modulation
can be applied to adaptive delta modulation.

Selection of the step size and the maximum gain are key
considerations in designing adaptive delta modulation systems. Abate has
developed the following equations, which give the step size as

k = ^-^ln2B^v^ • (-2,38) 

and the maximum gain as 

^ (2-39) 

where $S_1$ = smallest value for signal power
      $S_2$ = largest value for signal power.

Control and selection of the intermediate values must also be
determined. It is desirable to control the values based on the pulse
sequence. This can be justified in two different ways. First, when slope
overload occurs, causing suboptimal performance, the quantizer output is
a series of pulses of the same polarity. In response, a gain $K_i$
greater than $K_{i-1}$ (the previously used gain) is selected such that
the new, larger step size is $K_i$ multiplied by the smallest step size
k, or $K_i k$. If the polarity remains unchanged, the step size is
incrementally increased until the largest value $K_n k$ is reached. The
step size incrementally decreases when a polarity reversal occurs. In
the decoder, an identical process takes place. The result of switching
these gains is to produce a sequence of equally likely pulses of ones
and zeroes, which ensures that slope overload will not occur (or is not
likely to occur).

Figure 14. Discrete Adaptive Delta Modulator

A second way of viewing this situation is from an information 
theory point of view. It is well known (as shown in any information 
theory text) that the greatest amount of information is transmitted from 
a source to a receiver when, at the receiver, the pulse values to be 
determined are equally likely. Thus, attempting to produce a sequence 
of equally likely pulse values by adjusting the step size results in 
maximizing the information content of the signal, thus reducing the
redundancy. 

With the Abate algorithm, the gains are set linearly equal to the
number of consecutive pulses with the same polarity until the maximum
gain is achieved. This represents linear gain factor increments as
follows:

    $K_i = i$   for $i \le n$
    $K_i = n$   for $i > n$.                                      (2.40)

For voice transmission, NASA has implemented a modified Abate
algorithm with step sizes chosen as shown in Table 6. The difference
between this algorithm and the one studied by Abate is the manner in
which the gain is decreased when a sequence of the same polarity
terminates. With the regular Abate algorithm, the gain is stepped down
to a smaller value, whereas with the modified Abate algorithm, the gain
returns to the smallest step size k when a sequence of the same polarity
is ended by a reversal. Also, the Abate algorithm as implemented by NASA
uses eight different gain values, which differs from the four used by
Abate.
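
The step-size rule of Table 6 can be sketched as a function of the most
recent output bits. This sketch assumes that the table is read with the
latest bit at the right, that the step magnitude equals the length of
the current run of identical bits (capped at eight), and that the sign
follows the latest bit; the function name is hypothetical.

```python
def table6_step(bits, max_run=8):
    """Return the signed step (in units) after the latest bit in `bits`.

    bits is the encoded sequence, oldest first and latest last.
    """
    latest = bits[-1]
    run = 0
    for bit in reversed(bits):
        if bit != latest or run == max_run:
            break
        run += 1
    # A run of ones steps up, a run of zeroes steps down; a polarity
    # reversal restarts the run, so the step returns to one unit.
    return run if latest == 1 else -run
```

For example, the sequence ending in 0 1 1 1 gives a step of +3 units,
and a single 0 following a run of ones gives -1 unit.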

Table 6. Step-Size Algorithm (Abate)

    Encoded Sequence            Delta Step-Size
    (latest bit at right)       (Units)

    x x x x x x 0 1                   1
    x x x x x 0 1 1                   2
    x x x x 0 1 1 1                   3
    x x x 0 1 1 1 1                   4
    x x 0 1 1 1 1 1                   5
    x 0 1 1 1 1 1 1                   6
    0 1 1 1 1 1 1 1                   7
    1 1 1 1 1 1 1 1                   8
    0 0 0 0 0 0 0 0                  -8
    1 0 0 0 0 0 0 0                  -7
    x 1 0 0 0 0 0 0                  -6
    x x 1 0 0 0 0 0                  -5
    x x x 1 0 0 0 0                  -4
    x x x x 1 0 0 0                  -3
    x x x x x 1 0 0                  -2
    x x x x x x 1 0                  -1

    x = don't care; peak-to-peak input analog signal = 256 units.

Another possible choice of gain values, which was proposed by
Winkler [10] and which would appear to be more suitable for image
transmission, in which rapid level changes can occur (at edges of
objects), is to let the gains vary exponentially with sequences of
pulses of the same polarity as follows:

    $K_i = 2^{i-1}$                                               (2.41)

This system has been called High Information Delta Modulation (HIDM) by
Winkler.
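
The advantage of exponential gains at an edge can be illustrated by
counting how many consecutive same-polarity steps are needed to climb a
level change of a given height. This is a rough sketch under stated
assumptions: the gains are left uncapped (no maximum gain), the step
size k is one unit, and the function name is hypothetical.

```python
def samples_to_reach(height, gain, k=1.0):
    """Count same-polarity steps needed for the estimate to rise `height`.

    gain(i) is the gain applied on the i-th consecutive pulse (i >= 1).
    """
    total = 0.0
    i = 0
    while total < height:
        i += 1
        total += gain(i) * k
    return i


def linear_gain(i):
    return i                  # linear law, as in equation (2.40)


def exponential_gain(i):
    return 2 ** (i - 1)       # exponential law, as in equation (2.41)
```

For an edge of 100 units, the linear law needs 14 samples while the
exponential law needs 7, which is consistent with the remark that
exponential gains suit the rapid level changes found in images.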

2.4.6 Multi-Level Adaptive Delta Modulation 

A conventional delta modulator has two levels and is, strictly
speaking, a one-bit DPCM. It is possible to apply the adaptive approach
of the previous section to standard DPCM. In this case, the adaptive
DPCM is often referred to as "multi-level adaptive delta modulation."
Of particular interest is a tri-state delta modulator, where the video
signal is divided into three main regions of activity:

    (1) Acquisition - The estimate tries to "catch up" with a rapid
        change in grey level.

    (2) Tracking - The estimate follows a slowly changing grey level.

    (3) Transition - The estimate must change from acquisition to
        tracking.

Overshoots (and undershoots), which are inherently connected with
the fast acquisition behavior of an ADM and are particularly disturbing
in video transmission, can easily be suppressed by the tri-state delta
modulator as it switches to its tracking mode. Because of its importance
for video data compression, the tri-state delta modulator is discussed
in detail in Section 6.0.

3.0 ANALYSIS-SYNTHESIS METHODS - TRANSFORM CODING
3.1 General 

Even though fine sampling and quantization of an image are essential 
for desirable subjective quality of a digital picture, from the viewpoint 
of a statistician, the information in the picture can be conveyed quite 
adequately without all these variables. On the other hand, one cannot 
simply discard a part of these variates because of their equal statistical 
significance and the adverse effect this would have on the subjective 
quality of the picture. 

An approach to this problem is to transform the image samples to a
new set of variates that have a complementary degree of significance in
contributing to both the information content and the subjective quality
of the resulting picture. Then one can discard the less significant of
these variables without affecting the statistical information content of
the picture or causing a severe degradation in the subjective quality of
the resultant picture. The method of "principal components" is a
coordinate transformation with the above properties.

To illustrate the application of this coordinate transformation to
pictorial data, consider two samples on a picture. Let $x_1$ and $x_2$
stand for the values these two pixels could assume. From the fact that
these variables are correlated, one can easily conclude that the sample
values often fall inside the region shown in Figure 15. Now if one
rotates the coordinate system to the positions indicated by $y_1$ and
$y_2$, two new variables are obtained, one having a larger variance than
the other, though the sum total of the variances is invariant under the
transformation. Performing the rotation of coordinates with all N
pixels, one obtains N new variates that are uncorrelated and have
monotonically decreasing variances. Indeed, the method of principal
components, also known as the discrete Karhunen-Loeve Transformation,
results in variates with a maximum compaction of energy in the first M
components that one desires to keep. Since the information content of
the sampled picture is invariant under this linear transformation, and
the variance of a variable is a measure of its information content, the
coding strategy should be first to discard variates with low variances.
Then, since quantizing each number corresponds to approximating its
amplitude with the nearest rational number and thus corresponds to a
loss of information, one should attempt to impose less distortion on
those variates having larger variances.

Figure 15. Coordinate System for the Grey Levels of Two Adjacent Pixels
$x_1$ and $x_2$. The most likely values are the ones in the shaded area;
$y_1$ and $y_2$ are the new coordinate system.

The above transformation corresponds to a matrix transformation of
a vector. For two-dimensional data (an image), it is accomplished by
ordering the sampled data in vector form and performing the
transformation. The block diagram of this transformation is shown in
Figure 16. Denoting the data vector and the transform vector by X and Y
and the error introduced by quantization by the vector Q, the vector X*
reconstructed at the receiver is

    $X^* = X + A^{-1} Q$                                          (3.1)

where A corresponds to the operator in the method of principal
components. Then the resulting mean squared error is

    $\epsilon^2 = \mathrm{tr}\,[A^{-1} E\{QQ^T\} (A^{-1})^T]$      (3.2)

where tr denotes the trace of a matrix. Since AX is an uncorrelated
vector, it follows that Q is also uncorrelated; thus, $E\{QQ^T\}$ is
diagonal, with the i-th element on its diagonal referring to the
variance of the quantization error in the i-th component of the Y
vector. It is shown that the overall coding error is minimized if binary
digits are assigned to the various components of the transformed vector
Y so that the variances of the quantization error for all components of
Y are identical. Since the variance of the quantization error is shown
to be directly proportional to the variance of the variable being
quantized, an equal quantization noise for the various components of Y
implies assigning more bits to the components with large variances and
fewer bits to those with small ones. Of course, assigning zero bits to a
particular component of Y means that the particular component is ignored
in transmission or storage. Note that the above may not be possible
exactly, since the number of binary digits assigned to each sample must
be an integer. Thus, only an approximate solution is possible in
practice.
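
An integer bit assignment that approximately equalizes the quantization
error variances can be sketched with a greedy rule. This is not the
procedure of the report: the error model (error variance equal to the
component variance times 4 to the power of minus the number of bits,
i.e. about 6 dB per bit) and the function name are assumptions made for
illustration.

```python
def allocate_bits(variances, total_bits):
    """Give each successive bit to the component with the largest error."""
    bits = [0] * len(variances)
    for _ in range(total_bits):
        # Assumed model: each added bit divides the error variance by 4.
        errors = [v * 4.0 ** (-b) for v, b in zip(variances, bits)]
        worst = errors.index(max(errors))
        bits[worst] += 1
    return bits
```

For component variances of 16, 4, 1, and 0.25 and a budget of six bits,
the rule assigns 3, 2, 1, and 0 bits, leaving every component with the
same error variance under the assumed model; the last component receives
no bits and is simply not sent.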

Elimination of transform coefficients is generally done either on
a zonal filtering basis (in which a fixed number of coefficients with
largest energy are retained) or on a threshold filtering basis (in which 
those coefficients which exceed a fixed threshold are retained). Those 
coefficients which are judged to be less significant than others may be 
coded using fewer bits or they may be discarded entirely. 
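
The two selection rules can be contrasted in a short sketch. The
function names are hypothetical, and the zone is represented here as a
fixed set of coefficient indices chosen in advance, which is an
assumption about how the "fixed number of coefficients" is specified.

```python
def zonal_filter(coefficients, zone):
    """Keep coefficients whose index lies in a fixed, predetermined zone."""
    return [c if i in zone else 0.0 for i, c in enumerate(coefficients)]


def threshold_filter(coefficients, threshold):
    """Keep coefficients whose magnitude exceeds a fixed threshold."""
    return [c if abs(c) > threshold else 0.0 for c in coefficients]
```

With zonal filtering the receiver knows in advance which coefficients
arrive; with threshold filtering the retained positions depend on the
data, so in practice their locations would also have to be conveyed.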



Figure 16. Block Diagram of Transform Coding System


As indicated in Figure 17, a set of N video samples may be used
to derive a set of N transform coefficients. The generalized transform
relationship is

    $S(t) = \sum_{n=0}^{N-1} C_n A_n(t)$,   $0 \le t \le T$        (3.3)

where $C_n$    = transform coefficients
      $A_n(t)$ = set of orthogonal basis functions.

Of the N transform coefficients which are calculated, it is assumed
that only M ($M \le N$) of these are transmitted, and that more bits may
be used to represent those transmitted coefficients that are deemed to
have more significance.

The choice of orthogonal basis functions that can be used to define
a particular transform is practically unlimited. It is known that the
Karhunen-Loeve transformation is optimum in the mean-square-error sense,
but it is also known that other suboptimum transformations are
computationally simpler due to the availability of fast algorithms.

3.2 Walsh-Hadamard Transform Techniques 

A Hadamard matrix is an orthogonal matrix in which all of the 
elements take the values +1 or -1. One particular subset of Hadamard 
matrices is the Walsh-Hadamard, for which there exists a fast transform 
algorithm which reduces the computational requirements. Figure 18 
illustrates a set of one-dimensional Walsh functions of length $2^3 = 8$, while 
Figure 19 shows the normal Hadamard matrices of order N = 2, 4, and 8. 

By rearranging the rows of the Hadamard matrices in sequency order 
(sequency = 1/2 x number of sign changes, rounded up to nearest integer), 
the set of discrete Walsh Function Matrices shown in Figure 20 is obtained. 
The rows of these matrices can be used as the orthogonal basis functions 
for the one-dimensional Walsh-Hadamard transform, which is illustrated 
in Figure 21. 
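
The construction described above can be sketched in a few lines. The function names are illustrative only. Rows are sorted by their number of sign changes, which takes a distinct value from 0 to N-1 for each row and therefore fixes the sequency order unambiguously.

```python
def hadamard(n):
    """Natural-order Hadamard matrix of order n (n a power of 2),
    built by the recursion H_2n = [[H_n, H_n], [H_n, -H_n]]."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def sign_changes(row):
    """Number of sign changes along a row."""
    return sum(1 for a, b in zip(row, row[1:]) if a != b)


def sequency(row):
    """Half the number of sign changes, rounded up to the nearest integer."""
    return (sign_changes(row) + 1) // 2


def walsh(n):
    """Hadamard matrix of order n with its rows rearranged in sequency order."""
    return sorted(hadamard(n), key=sign_changes)


sequencies_h8 = [sequency(row) for row in hadamard(8)]
W4 = walsh(4)
```

For N = 8 the row sequencies come out as 0, 4, 2, 2, 1, 3, 1, 3, and `walsh(4)` gives the order-4 matrix used in the numerical examples that follow.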

Figure 22 provides an example application of the one-dimensional 
Walsh-Hadamard transform to a set of four data samples ($S_0$, $S_1$, $S_2$, $S_3$) 
to obtain the four transform coefficients $C_0$, $C_1$, $C_2$, and $C_3$, while 
Figure 23 shows how the original data samples can be recovered from the 
transform coefficients. Obviously, from this example, it can be seen that 
the $C_2$ and $C_3$ coefficients need not be transmitted over the channel. 

    N = 2:    H_2 =   1  1
                      1 -1

    N = 4:    H_4 =   H_2  H_2   =   1  1  1  1
                      H_2 -H_2       1 -1  1 -1
                                     1  1 -1 -1
                                     1 -1 -1  1

    N = 8:    H_8 =                            Sequency
                  1  1  1  1  1  1  1  1          0
                  1 -1  1 -1  1 -1  1 -1          4
                  1  1 -1 -1  1  1 -1 -1          2
                  1 -1 -1  1  1 -1 -1  1          2
                  1  1  1  1 -1 -1 -1 -1          1
                  1 -1  1 -1 -1  1 -1  1          3
                  1  1 -1 -1 -1 -1  1  1          1
                  1 -1 -1  1 -1  1  1 -1          3

Figure 19. Hadamard Matrices 

Figure 20. Discrete Walsh Function Matrices 
(Hadamard Matrices with Rows in 
Sequency Order) 

$$ C_k = \frac{1}{N} \sum_{j=0}^{N-1} S_j \, \mathrm{WAL}(k,j) , \qquad S_j = \sum_{k=0}^{N-1} C_k \, \mathrm{WAL}(k,j) $$

IN MATRIX NOTATION: 

$$ [C] = \frac{1}{N} \, [\mathrm{WAL}(k,j)] \, [S] , \qquad [S] = [\mathrm{WAL}(k,j)] \, [C] $$

where [C] and [S] are N x 1 and [WAL(k,j)] is N x N. 

Figure 21. Finite One-Dimensional Walsh-Hadamard Transform 

DATA SAMPLES: $S_0 = 2$, $S_1 = 2$, $S_2 = 1$, $S_3 = 1$ 

TRANSFORM COEFFICIENTS: 

$$ C_0 = \tfrac{1}{4} [ S_0 \mathrm{WAL}(0,0) + S_1 \mathrm{WAL}(0,1) + S_2 \mathrm{WAL}(0,2) + S_3 \mathrm{WAL}(0,3) ] = \frac{S_0 + S_1 + S_2 + S_3}{4} = \frac{6}{4} $$

$$ C_1 = \tfrac{1}{4} [ S_0 \mathrm{WAL}(1,0) + S_1 \mathrm{WAL}(1,1) + S_2 \mathrm{WAL}(1,2) + S_3 \mathrm{WAL}(1,3) ] = \frac{S_0 + S_1 - S_2 - S_3}{4} = \frac{2}{4} $$

$$ C_2 = \frac{S_0 - S_1 - S_2 + S_3}{4} = 0 , \qquad C_3 = \frac{S_0 - S_1 + S_2 - S_3}{4} = 0 $$

[Figure 22: example of the one-dimensional Walsh-Hadamard transform] 

    [S] = [WAL(k,j)] [C]

          1  1  1  1       1.5         2
    =     1  1 -1 -1   x   0.5    =    2
          1 -1 -1  1       0           1
          1 -1  1 -1       0           1

Figure 23. Example of Inverse Walsh-Hadamard Transform 
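
The numerical example of Figures 22 and 23 can be checked with a short sketch. It assumes the data samples (2, 2, 1, 1) implied by Figure 23 and hard-codes the order-4 Walsh matrix; exact fractions are used so that the coefficients 6/4 and 2/4 appear without rounding. The function names are illustrative.

```python
from fractions import Fraction

# Order-4 Walsh matrix, rows in sequency order
WAL4 = [[1, 1, 1, 1], [1, 1, -1, -1], [1, -1, -1, 1], [1, -1, 1, -1]]


def wht(samples, wal):
    """Forward transform: C_k = (1/N) * sum_j S_j * WAL(k, j)."""
    n = len(samples)
    return [Fraction(sum(w * s for w, s in zip(row, samples)), n) for row in wal]


def inverse_wht(coeffs, wal):
    """Inverse transform: S_j = sum_k C_k * WAL(k, j)."""
    n = len(coeffs)
    return [sum(coeffs[k] * wal[k][j] for k in range(n)) for j in range(n)]


S = [2, 2, 1, 1]
C = wht(S, WAL4)
recovered = inverse_wht(C, WAL4)
# Only C_0 and C_1 are "transmitted"; C_2 and C_3 are assumed zero.
recovered_from_two = inverse_wht(C[:2] + [Fraction(0), Fraction(0)], WAL4)
```

Because $C_2$ and $C_3$ are exactly zero for this data, the samples recovered from the first two coefficients alone are identical to the original samples.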

3.3 Other Transform Techniques 

In addition to the Walsh-Hadamard transform, there are several 
other transforms worthy of note. The Karhunen-Loeve transform (KLT) is 
the optimum in terms of mean-square error. Any transform can be expressed 
in the form of (3.3), where $\phi_n(t)$ is the set of orthogonal basis functions. 
The basis functions for the KLT are the characteristic functions of the 
integral equation: 

$$ \int_0^T R(t,\tau) \, \phi_n(\tau) \, d\tau = |a_n|^2 \, \phi_n(t) , \tag{3.4} $$

where $R(t,\tau) = E\{S(t) S^*(\tau)\}$ is the autocorrelation of the video signal S(t). 
The solutions to (3.4) are the characteristic values $|a_n|^2$ and the 
characteristic functions $\phi_n(t)$. Note that 

$$ \int_0^T \phi_m(t) \, \phi_n^*(t) \, dt = \begin{cases} 1 , & \text{if } m = n \\ 0 , & \text{if } m \neq n \end{cases} \tag{3.5} $$

and the transform coefficients are given by 

$$ C_n = \int_0^T S(t) \, \phi_n^*(t) \, dt \tag{3.6} $$

where 

    = 1 , if m = n 

    = 0 , if m ≠ n          (3.7) 

The difficulty in using the KLT is solving (3.4) to determine the 
characteristic functions. Therefore, other suboptimum transforms are 
considered that are computationally simpler. 
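
For sampled data, (3.4) becomes a matrix eigenvalue problem: the KLT basis vectors are the eigenvectors of the covariance matrix of the samples. The sketch below is an illustration under stated assumptions, not the report's procedure. It assumes a first-order Markov covariance $\rho^{|i-j|}$ with a hypothetical $\rho = 0.95$ and finds only the dominant eigenvector by power iteration; a full KLT would need all N eigenvectors.

```python
import math


def markov_covariance(n, rho):
    """Covariance matrix R[i][j] = rho**|i-j| (assumed first-order Markov model)."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]


def dominant_eigenpair(matrix, iterations=500):
    """Power iteration: returns the largest eigenvalue and its unit eigenvector."""
    n = len(matrix)
    v = [1.0 / math.sqrt(n)] * n
    eigenvalue = 0.0
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        eigenvalue = math.sqrt(sum(x * x for x in w))
        v = [x / eigenvalue for x in w]
    return eigenvalue, v


R = markov_covariance(8, 0.95)      # rho = 0.95 is a hypothetical value
lam, phi0 = dominant_eigenpair(R)
```

For highly correlated data the dominant basis vector has all-positive, slowly varying entries, and its eigenvalue accounts for most of the total signal energy (the trace of R, here 8).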

The Fourier transform is often used. The basis functions $\phi_n(t)$ 
for the Fourier transform are 

$$ \phi_n(t) = e^{jn\omega t} \tag{3.8} $$

and the transform coefficients are given by 

$$ C_n = \frac{1}{T} \int_0^T s(t) \, e^{-jn\omega t} \, dt . \tag{3.9} $$

If the video signal is sampled every $\tau$ seconds, where $N\tau = T$, then the 
discrete Fourier transform (DFT) is given by 

$$ C_n = \frac{1}{N} \sum_{k=0}^{N-1} s(k\tau) \, e^{-j\Omega\tau nk} \tag{3.10} $$

where 

$$ \Omega = \frac{2\pi}{N\tau} \tag{3.11} $$

and $s(k\tau)$ is the sampled video signal. The inverse DFT is given by 

$$ s(k\tau) = \sum_{n=0}^{N-1} C_n \, e^{j\Omega\tau nk} . \tag{3.12} $$
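
A direct evaluation of the DFT pair is sketched below. It assumes the 1/N factor is carried by the forward transform and that $\Omega\tau = 2\pi/N$; the function names are illustrative, and a practical coder would use an FFT instead of this $N^2$ summation.

```python
import cmath


def dft(samples):
    """C_n = (1/N) * sum_k s(k) * exp(-j*2*pi*n*k/N)."""
    count = len(samples)
    return [sum(s * cmath.exp(-2j * cmath.pi * n * k / count)
                for k, s in enumerate(samples)) / count
            for n in range(count)]


def inverse_dft(coeffs):
    """s(k) = sum_n C_n * exp(+j*2*pi*n*k/N)."""
    count = len(coeffs)
    return [sum(c * cmath.exp(2j * cmath.pi * n * k / count)
                for n, c in enumerate(coeffs))
            for k in range(count)]


samples = [2, 2, 1, 1]
C = dft(samples)
recovered = inverse_dft(C)
```

With this normalization $C_0$ is the mean of the samples, as it is for the Walsh-Hadamard transform.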


One of the advantages of the DFT is the existence of a fast 
algorithm (FFT) to compute the transform [11]. The discrete cosine transform 
(DCT) can also be computed using the FFT algorithm. The performance of 
the DCT in the mean-square-error sense is close to the optimum performance 
of the KLT. The transform coefficients are given by 

$$ C_n = \cdots \sum_{k=0}^{N-1} s(k\tau) \, \cos ( \cdots ) \tag{3.13} $$

and the inverse DCT is defined as 

$$ s(k\tau) = \cdots \sum_{n} C_n \cos ( \cdots ) . \tag{3.14} $$

When various transforms are used in conjunction with scalar Wiener 
filtering at unity signal-to-noise ratio (white, zero-mean noise), the 
performance of the DCT is very close to that of the optimum KLT [12], as shown 
in Figure 24. Note that the DCT is significantly better in a mean-square-error 
sense than the DFT or the Walsh-Hadamard transform. 
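
As an illustration only, the sketch below uses the orthonormal form of the DCT (the type usually called DCT-II) together with its inverse. This normalization is an assumption and may differ from the scale factors in (3.13) and (3.14); the function names are illustrative.

```python
import math


def dct(samples):
    """Orthonormal DCT-II (assumed normalization)."""
    n = len(samples)
    out = []
    for m in range(n):
        scale = math.sqrt(1.0 / n) if m == 0 else math.sqrt(2.0 / n)
        out.append(scale * sum(s * math.cos((2 * k + 1) * m * math.pi / (2 * n))
                               for k, s in enumerate(samples)))
    return out


def inverse_dct(coeffs):
    """Inverse of the orthonormal DCT-II (the transpose of the forward matrix)."""
    n = len(coeffs)
    out = []
    for k in range(n):
        total = 0.0
        for m, c in enumerate(coeffs):
            scale = math.sqrt(1.0 / n) if m == 0 else math.sqrt(2.0 / n)
            total += scale * c * math.cos((2 * k + 1) * m * math.pi / (2 * n))
        out.append(total)
    return out


flat = dct([3.0] * 8)                       # constant grey-level line
line = [2.0, 2.0, 2.0, 1.0, 1.0, 1.0, 1.0, 1.0]
round_trip = inverse_dct(dct(line))
```

A line of constant grey level maps onto the first coefficient alone, which is the energy-compaction behaviour that makes the DCT attractive for image lines.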



Figure 24. Mean-Square Error Performance of Various 
Transforms for Scalar Wiener Filtering 

In considering the use of transforms for video signals, it is worth noting 
that a major attribute of an image transform is the compaction 
of the image energy into a few of the transform domain samples. A high 
degree of energy compaction will result if the basis vectors of the 
transform matrix "resemble" typical horizontal or vertical lines of an 
image. If the lines of a typical monochrome image are examined, it will 
be found that a large number of the lines are of constant grey level over 
a considerable length. The Fourier and Hadamard transforms possess a 
constant valued basis vector that provides an efficient representation 
for constant grey level image lines, while the Karhunen-Loeve transform 
has a nearly constant basis vector suitable for this representation. 
Another typical image line is one which increases or decreases in 
brightness over the length in a linear fashion. None of the transforms 
previously mentioned possess a basis vector that efficiently represents 
such image lines. 

Shibata and Enomoto have introduced orthogonal transforms containing 
a "slant" basis vector for data of vector lengths of four and eight [13]. 
The slant vector is a discrete sawtooth waveform decreasing in uniform 
steps over its length, which is suitable for efficiently representing 
gradual brightness changes in an image line. Their work gives no 
indication of a construction for larger size data vectors, nor does it exhibit 
the use of a fast computational algorithm. 
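
The benefit of a slant basis vector can be seen on a four-sample line that falls in uniform steps. The sketch below is illustrative: the order-4 slant matrix is the form commonly quoted in the slant-transform literature and is an assumption here, since the matrix itself is not listed in the text. Both matrices are scaled to be orthonormal.

```python
import math

A = 3 / math.sqrt(5)
B = 1 / math.sqrt(5)

# Assumed order-4 slant matrix; row 2 is the slant vector (3, 1, -1, -3)
SLANT4 = [[0.5 * v for v in row] for row in (
    (1, 1, 1, 1),
    (A, B, -B, -A),
    (1, -1, -1, 1),
    (B, -A, A, -B),
)]

# Order-4 Walsh matrix in sequency order, same orthonormal scaling
WALSH4 = [[0.5 * v for v in row] for row in (
    (1, 1, 1, 1),
    (1, 1, -1, -1),
    (1, -1, -1, 1),
    (1, -1, 1, -1),
)]


def transform(matrix, samples):
    """Coefficients = matrix times the sample vector."""
    return [sum(m * s for m, s in zip(row, samples)) for row in matrix]


def count_significant(coeffs, tol=1e-9):
    """Number of coefficients that are not (numerically) zero."""
    return sum(1 for c in coeffs if abs(c) > tol)


ramp = [4, 3, 2, 1]                       # brightness falling in uniform steps
slant_coeffs = transform(SLANT4, ramp)
walsh_coeffs = transform(WALSH4, ramp)
```

The ramp needs only two slant coefficients (the constant and slant terms) but three Walsh-Hadamard coefficients.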

With this background, an investigation was undertaken to develop 
an image-coding slant-transform matrix possessing the following 
properties: (1) orthonormal set of basis vectors; (2) one constant basis 
vector; (3) one slant basis vector; (4) sequency property; (5) variable 
size transformation; (6) fast computational algorithm; and (7) high energy 
compaction. Figure 25 presents the slant transform basis functions [14] 
for N = 16. 

Figure 26 contains a plot of mean-square error as a function of 
block size for several transforms for coding with an average of 1.5 bits/ 
pixel. The figure indicates that the performance of the slant transform 
is quite close to that of the optimal Karhunen-Loeve transform. It is possible 
to achieve a slightly lower mean-square error for a given channel rate by 
employing Huffman coding of the quantized coefficients rather than constant 
word-length coding, but the coder will be much more complex to implement. 

Figure 26. Mean-Square Error Performance of Image Transforms as a Function 
of Block Size 

4.0 MULTIDIMENSIONAL TECHNIQUES 

4.1 General 

It is possible to achieve greater reduction in video redundancy by 
coding over the two spatial dimensions and the temporal dimension. It is 
straightforward to extend the one-dimensional data compression techniques 
to multiple dimensions just by applying them sequentially to each dimension. 
In fact, hybrids of various techniques could be used. For example, analysis 
of the transform and DPCM image coding techniques has disclosed that each 
possesses attractive characteristics and some limitations [15,16]. 
Transform coding systems achieve superior coding performance at lower bit rates; 
they distribute coding degradation in a manner less objectionable to a 
human viewer, show less sensitivity to data statistics (picture-to-picture 
variation), and are less vulnerable to channel noise. DPCM systems, on 
the other hand, when designed to take advantage of spatial correlation of 
image data, achieve a better coding performance at a higher bit rate. 
The equipment complexity and the delay due to the coding operations are 
minimal. Perhaps the most desirable characteristic of DPCM is the ease 
of design and the speed of the operation that has made it possible for 
DPCM systems to be used in coding television signals in real time. The 
limitations of DPCM are the sensitivity of even well-designed systems to 
picture statistics and the propagation of channel errors in a coded 
picture. 

A hybrid coding system that combines the attractive features of 
both transform and DPCM coding has been developed [16]. This system 
exploits the correlation of the data in the horizontal direction by 
taking a one-dimensional transform of each line of the picture, then 
operating on each column of the transformed data using a one-element 
predictor DPCM system. Since the unitary transformation involved is a 
one-dimensional transformation of individual lines of the pictorial data, 
the equipment complexity and the number of computational operations are 
considerably less than that involved in a two-dimensional transformation. 
Theoretical and experimental results indicate that the hybrid system has 
good coding capability, one that surpasses both DPCM and the transform 
coding systems. 



4.2 Two-Dimensional Spatial Techniques 
4.2.1 Two-Dimensional DPCM 

Two-dimensional DPCM (2D-DPCM) can easily be implemented. In this 
case, the picture is represented by rows (lines) and columns. The estimate 
of a picture element is based on the previous adjacent picture elements in 
the row and column as shown in Figure 27. 

Figure 28 illustrates the functional operation of the 2D-DPCM 
encoder. Incoming digitized video information is brought in, and the 
difference between it and its predicted value is quantized by the fixed 
quantizer. The quantizer contains 8 symmetrically distributed cutpoints, 
each having its associated output value. The output of the fixed quantizer 
is recorded for transmission, combined with synchronization information 
in a multiplexer, and supplied to the transmission link. Additionally, 
the fixed quantizer output is summed with the previous adjacent sample 
values, and stored for one sample time in a latch. Simultaneously, the 
present and previous samples from the line under consideration are each 
scaled, and their difference stored in the Line Store Memory. Gating 
logic is provided such that samples are stored only during the active 
horizontal line time, and not during retrace. The output of the Line 
Store Memory, representing the adjacent samples from the previous line, 
is summed with the scaled output from the latch (representing the adjacent 
sample from the line under consideration), and is subtracted from the next 
adjacent sample on the line under consideration. 
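
The prediction loop can be sketched in software as follows. This is a simplified illustration, not the hardware of Figure 28: the equal prediction weights, the mid-grey starting value of 128, and the eight quantizer output levels are all hypothetical, and the function names are illustrative. The essential point is that the encoder predicts from its own reconstructed samples, so the decoder can repeat exactly the same predictions.

```python
LEVELS = [-30, -14, -6, -2, 2, 6, 14, 30]   # hypothetical 3-bit quantizer outputs


def quantize(diff):
    """Index of the output level nearest to the prediction error."""
    return min(range(len(LEVELS)), key=lambda i: abs(LEVELS[i] - diff))


def predict(recon, row, col):
    """Predict from the reconstructed left and upper neighbours (assumed weights 1/2)."""
    left = recon[row][col - 1] if col > 0 else None
    above = recon[row - 1][col] if row > 0 else None
    if left is None and above is None:
        return 128.0                         # assumed mid-grey start value
    if left is None:
        return float(above)
    if above is None:
        return float(left)
    return 0.5 * left + 0.5 * above


def encode(image):
    """Return the quantizer indices and the encoder's local reconstruction."""
    rows, cols = len(image), len(image[0])
    recon = [[0.0] * cols for _ in range(rows)]
    codes = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            p = predict(recon, r, c)
            codes[r][c] = quantize(image[r][c] - p)
            recon[r][c] = p + LEVELS[codes[r][c]]
    return codes, recon


def decode(codes):
    """Rebuild the picture from the quantizer indices alone."""
    rows, cols = len(codes), len(codes[0])
    recon = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            recon[r][c] = predict(recon, r, c) + LEVELS[codes[r][c]]
    return recon


image = [[120, 122, 124, 126],
         [121, 123, 125, 127],
         [122, 124, 126, 128],
         [123, 125, 127, 129]]
codes, local = encode(image)
decoded = decode(codes)
```

Each picture element is sent as a 3-bit index, and the decoder output matches the encoder's local reconstruction sample for sample.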

Figure 29 illustrates the functional operation of the Multiple 
Loop Adaptive 2D-DPCM Encoder. This system utilizes four nonadaptive 
loops as previously shown in Figure 28. However, each loop is fitted 
with a different quantizer. The output of each loop is stored in a block 
storage register (16 samples). The Block Select Logic continuously 
calculates the mean squared error for each block, at the sample rate. Upon 
completion of a block of data, the Block Select Logic selects for 
transmission the data block having the lowest mean squared error. 

Figure 30 illustrates the single-loop block-adaptive 2D-DPCM 
encoder which utilizes multiple quantizers. Here, a 16 sample (data 
block) delay is provided, while the Gain Computation Logic determines 
the mean squared error and selects one of eight different quantizers 
within the Compressor Loop. 

Figure 27. Block Diagram of Linear Predictive Coder and 
the Pixels Employed in Predicting Pixel 

Figure 28. Non-Adaptive 2D-DPCM Encoder 

Figure 30. Single Loop Block Adaptive 2D-DPCM Encoder 

4.2.2 Two-Dimensional Transforms 

Two-dimensional transforms are also possible. The forward and inverse 
two-dimensional Karhunen-Loeve transformations are given by 

$$ u_{ij} = \sum_{y=1}^{N} \sum_{x=1}^{N} \phi_{ij}(x,y) \, u(x,y) \tag{4.1} $$

$$ u(x,y) = \sum_{i=1}^{N} \sum_{j=1}^{N} \phi_{ij}(x,y) \, u_{ij} . \tag{4.2} $$

It has been shown that the "eigenmatrices" $\phi_{ij}(x,y)$ of the whole picture 
or of subblocks of the picture that are composed of NxN pixels can be formed 
from outer products of eigenvectors of the covariance of the data in the 
horizontal and vertical directions if the covariance of the data is 
separable and of the form 

$$ R(x,\hat{x},y,\hat{y}) = R_H(x,\hat{x}) \, R_V(y,\hat{y}) , \tag{4.3} $$

where $R_H$ and $R_V$ refer to the covariance matrices of the data in the 
horizontal and vertical directions, and $R(x,\hat{x},y,\hat{y})$ is the covariance 
"tensor" of the data. In the absence of this assumption, the ordering of the 
two-dimensional data in a vector form is the only practical solution. 
Computations indicated by (4.1) and (4.2) correspond to operations on the rows 
of the image followed by operations on the columns of the horizontally 
transformed data to obtain the two-dimensional transformation. 
Approximately $N^3$ multiplication/addition operations are required to perform the 
transformation. Often, the two-dimensional Karhunen-Loeve transformation 
is obtained for the Markov process covariance function 

$$ R(x,\hat{x},y,\hat{y}) = \exp ( -\alpha |x - \hat{x}| - \beta |y - \hat{y}| ) \tag{4.4} $$

where $\alpha$ and $\beta$ are estimated from the image. 

The shortcomings of the method of principal components are the 
large number of operations required for forward and inverse 
transformation of (4.1) and (4.2), estimation of the covariance of the data, and 
calculation of the eigenmatrices. To eliminate these difficulties, a 
number of other transformations have been considered. 

For discrete data, the two-dimensional Fourier transformation 
corresponds to choosing the basis matrices (images) of the form 

$$ \phi_{ij}(x,y) = \frac{1}{N} \exp \left[ -\frac{2\pi\sqrt{-1}}{N} (ix + jy) \right] , \tag{4.5} $$

while the two-dimensional Hadamard transform [17,18] corresponds to choosing 
basis images as 

$$ \phi_{ij}(x,y) = \frac{1}{N} (-1)^{c} \tag{4.6} $$

where 

$$ c = \sum_{h=0}^{\log_2 N - 1} \left[ b_h(x) \, b_h(i) + b_h(y) \, b_h(j) \right] \tag{4.7} $$

and $b_h(\cdot)$ is the hth bit in the binary representation of $(\cdot)$, and N is 
a power of 2. Both of the above transformations are members of a class 
of Kronecker matrix transformations that have $2N^2 \log_2 N^2$ degrees of 
freedom, and can therefore be implemented by $2N^2 \log_2 N^2$ computer 
operations. These transformations remedy the shortcomings of the 
Karhunen-Loeve transformation by eliminating the necessity of finding an operator 
matched to the covariance of the image and significantly reducing the 
computational complexity. 

In addition to these transformations, a number of others possessing 
the above two properties have been considered. For example, Cosine [12] 
and Slant [13,14] transformations have resulted in a better mean-square-error 
performance than either the Fourier or Hadamard transformations. 
The performance of these transforms is still inferior to the performance 
of the Karhunen-Loeve transformation, which is the only orthogonal 
transform that generates a set of uncorrelated signals. However, in most 
practical applications, the computational simplicity and the ease of 
implementation of the Hadamard and other suboptimum transformations more 
than compensate for the suboptimal performance of the transforms. 

As an example of using two-dimensional transforms, consider the 
two-dimensional Walsh-Hadamard transform shown in Figure 31. 

Figure 31. Two-Dimensional Walsh Basis Pictures 
(Black represents +1/N; white represents -1/N) 

The video data [S] is an NxN matrix: 

$$ [S] = \begin{bmatrix} S_{00} & S_{01} & \cdots & S_{0,N-1} \\ S_{10} & S_{11} & \cdots & S_{1,N-1} \\ \vdots & & & \vdots \\ S_{N-1,0} & \cdots & & S_{N-1,N-1} \end{bmatrix} \tag{4.8} $$

and the Walsh-Hadamard transformed data is 

$$ [C] = \begin{bmatrix} C_{00} & C_{01} & \cdots & C_{0,N-1} \\ C_{10} & C_{11} & \cdots & C_{1,N-1} \\ \vdots & & & \vdots \\ C_{N-1,0} & \cdots & & C_{N-1,N-1} \end{bmatrix} \tag{4.9} $$

where the elements of [C] are given by 

$$ C_{k\ell} = \frac{1}{N^2} \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} S_{ij} \, \mathrm{WAL}(k,i) \, \mathrm{WAL}(\ell,j) , \qquad k,\ell = 0,1,\ldots,N-1 . \tag{4.10} $$

Similarly, by the inverse transform, 

$$ S_{ij} = \sum_{k=0}^{N-1} \sum_{\ell=0}^{N-1} C_{k\ell} \, \mathrm{WAL}(k,i) \, \mathrm{WAL}(\ell,j) , \qquad i,j = 0,1,\ldots,N-1 \tag{4.11} $$

or in matrix notation 

$$ [C] = \frac{1}{N^2} \, [\mathrm{WAL}(k,j)]^T \, [S] \, [\mathrm{WAL}(k,j)] \tag{4.12} $$

$$ [S] = [\mathrm{WAL}(k,j)] \, [C] \, [\mathrm{WAL}(k,j)]^T . \tag{4.13} $$

Figures 32 and 33 present a numerical example of a two-dimensional 
Walsh-Hadamard transform and its inverse. An alternate approach to the 
two-dimensional Walsh-Hadamard transform is to use a one-dimensional transform 
first on the rows of [S] and then on the columns of the resulting matrix 
[Y]. The previous numerical example is repeated in Figure 34. Note that 
the resulting transform matrix [C] is identical to the matrix [C] in 
Figure 32 using the two-dimensional transform. 
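
The equivalence of the two approaches can be checked numerically. The sketch below uses the 4x4 data of the numerical example and exact fractions; the function names are illustrative, and the Walsh matrix is hard-coded for N = 4.

```python
from fractions import Fraction

WAL4 = [[1, 1, 1, 1], [1, 1, -1, -1], [1, -1, -1, 1], [1, -1, 1, -1]]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(col) for col in zip(*m)]


def wht_2d(s, wal):
    """Matrix form (4.12): [C] = (1/N^2) [WAL]^T [S] [WAL]."""
    n = len(s)
    product = matmul(matmul(transpose(wal), s), wal)
    return [[Fraction(v, n * n) for v in row] for row in product]


def wht_1d(vec, wal):
    """One-dimensional transform with the 1/N factor."""
    n = len(vec)
    return [sum(Fraction(w) * v for w, v in zip(row, vec)) / n for row in wal]


def wht_rows_then_columns(s, wal):
    """Transform the rows of [S] to get [Y], then the columns of [Y] to get [C]."""
    y = [wht_1d(row, wal) for row in s]
    c_columns = [wht_1d(col, wal) for col in transpose(y)]
    return transpose(c_columns)


S = [[2, 2, 2, 1],
     [2, 2, 1, 1],
     [1, 1, 1, 1],
     [1, 1, 1, 1]]
C_matrix = wht_2d(S, WAL4)
C_separable = wht_rows_then_columns(S, WAL4)
```

Both routes give the same sixteen coefficients, the largest being $C_{00} = 21/16$.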
    [S] =    2  2  2  1
             2  2  1  1
             1  1  1  1
             1  1  1  1

TRANSFORM COEFFICIENTS: 

    [C] = (1/16) [WAL(k,j)]^T [S] [WAL(k,j)]

        =   21/16    3/16   -1/16    1/16
             5/16    3/16   -1/16    1/16
             1/16   -1/16   -1/16    1/16
             1/16   -1/16   -1/16    1/16

Figure 32. Example of Two-Dimensional Walsh-Hadamard Transform 

    [S] = [WAL(k,j)] [C] [WAL(k,j)]^T

    [WAL(k,j)] [C]  =   28/16    4/16   -4/16    4/16
                        24/16    8/16     0       0
                          1       0       0       0
                          1       0       0       0

    [S]  =   28/16   4/16  -4/16   4/16       1  1  1  1        2  2  2  1
             24/16   8/16    0      0     x   1  1 -1 -1   =    2  2  1  1
               1      0      0      0         1 -1 -1  1        1  1  1  1
               1      0      0      0         1 -1  1 -1        1  1  1  1

Figure 33. Example of Inverse Walsh-Hadamard Transform 

    S  -->  1-D TRANSFORM (ROWS OF S)  -->  Y  -->  1-D TRANSFORM (COLUMNS OF Y)  -->  C

    Each 1-D transform multiplies by

    (1/4)    1  1  1  1
             1  1 -1 -1
             1 -1 -1  1
             1 -1  1 -1

EXAMPLE: 

    [S] =    2  2  2  1
             2  2  1  1
             1  1  1  1
             1  1  1  1

CALCULATION OF [Y]: 

    Row 1:          (2, 2, 2, 1)  -->  (7/4, 1/4, -1/4, 1/4)
    Row 2:          (2, 2, 1, 1)  -->  (6/4, 2/4, 0, 0)
    Rows 3 and 4:   (1, 1, 1, 1)  -->  (1, 0, 0, 0)

    [Y] =    7/4   1/4  -1/4   1/4
             6/4   2/4    0     0
              1     0     0     0
              1     0     0     0

Figure 34. Alternate Two-Dimensional Walsh-Hadamard Transform Approach 

EXAMPLE (continued): 
CALCULATION OF [C]: 

    Column 1:   (7/4, 6/4, 1, 1)   -->  (21/16, 5/16, 1/16, 1/16)
    Column 2:   (1/4, 2/4, 0, 0)   -->  (3/16, 3/16, -1/16, -1/16)
    Column 3:   (-1/4, 0, 0, 0)    -->  (-1/16, -1/16, -1/16, -1/16)
    Column 4:   (1/4, 0, 0, 0)     -->  (1/16, 1/16, 1/16, 1/16)

    [C] =   21/16    3/16   -1/16    1/16
             5/16    3/16   -1/16    1/16
             1/16   -1/16   -1/16    1/16
             1/16   -1/16   -1/16    1/16

Figure 34 (continued) 
To illustrate data compression, consider the video data 

    2  2  1  1
    2  2  1  1
    1  1  1  1
    1  1  1  1                                        (4.14) 

The transformed matrix is 

    20/16   4/16   0   0
     4/16   4/16   0   0
      0      0     0   0
      0      0     0   0                              (4.15) 

Since only four elements of [C] are nonzero, only these four elements 
need to be transmitted. With only four elements received at the decoder, 
the other elements are assumed zero and the matrix [S] is perfectly 
reconstructed. Now assume instead that the video data [S] is the same as that 
used in Figures 32 and 34; then the transformed matrix [C] has more than 
four nonzero elements. If only the largest four coefficients 
are transmitted, the other coefficients are assumed zero at the receiver. 
In this case, the reconstructed video data is 

    2  2  1.25  1.25
    2  2  1.25  1.25
    1  1   1     1
    1  1   1     1                                    (4.16) 

Note that the video data [S] in (4.14) differs from 
the data in Figures 32 and 34 only in the upper right-hand corner of the 
matrix [S]. Thus, the assumption of all zeros except for four 
coefficients causes a smearing of the reconstructed data in the upper right-hand 
corner. Practical systems for high-quality video transmission would 
probably transmit approximately 10 coefficients. 
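
The smearing effect can be reproduced with a short sketch that transmits only the four largest coefficients of the Figure 32 example and reconstructs the block from them. The function names are illustrative; exact fractions are used to avoid rounding.

```python
from fractions import Fraction

WAL4 = [[1, 1, 1, 1], [1, 1, -1, -1], [1, -1, -1, 1], [1, -1, 1, -1]]
N = 4


def forward(s):
    """C_km = (1/N^2) * sum_i sum_j S_ij * WAL(k,i) * WAL(m,j)."""
    return [[Fraction(sum(s[i][j] * WAL4[k][i] * WAL4[m][j]
                          for i in range(N) for j in range(N)), N * N)
             for m in range(N)] for k in range(N)]


def inverse(c):
    """S_ij = sum_k sum_m C_km * WAL(k,i) * WAL(m,j)."""
    return [[sum(c[k][m] * WAL4[k][i] * WAL4[m][j]
                 for k in range(N) for m in range(N))
             for j in range(N)] for i in range(N)]


def keep_largest(c, count):
    """Zero every coefficient except the `count` largest in magnitude."""
    ranked = sorted(((abs(v), k, m) for k, row in enumerate(c)
                     for m, v in enumerate(row)), reverse=True)
    kept = {(k, m) for _, k, m in ranked[:count]}
    return [[v if (k, m) in kept else Fraction(0) for m, v in enumerate(row)]
            for k, row in enumerate(c)]


S = [[2, 2, 2, 1],
     [2, 2, 1, 1],
     [1, 1, 1, 1],
     [1, 1, 1, 1]]
C = forward(S)
reconstructed = inverse(keep_largest(C, 4))
```

The retained coefficients are 21/16, 5/16, 3/16 and 3/16, and the upper right-hand 2x2 corner of the reconstruction comes out as 1.25 throughout.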
4.2.3 Hybrid Two-Dimensional Techniques 

An attractive hybrid two-dimensional technique is to use a 
one-dimensional transform on each line of the picture and then operate on 
each column of the transformed data using a one-element predictor DPCM 
system. In the hybrid system shown in Figure 35, image data is scanned 
to form N lines and each line is sampled at the Nyquist rate. This 
sampled image is then divided into arrays of M by M picture elements 
u(x,y), where x and y index the rows and the columns in each individual 
array, so that the number of samples in a line of images is an integer 
multiple of M. The one-dimensional unitary transformation of the data 
and its inverse are modeled by the set of equations: 

$$ u_i(y) = \sum_{x=1}^{M} u(x,y) \, \phi_i(x) ; \qquad i = 1,2,\ldots,M ; \quad y = 1,2,\ldots,N \tag{4.17} $$

$$ u(x,y) = \sum_{i=1}^{M} u_i(y) \, \phi_i(x) , \tag{4.18} $$

where $\phi_i(x)$ denotes a set of M orthonormal basis vectors. Since the 
correlation of samples in various columns of the transformed array is different, 
a number of different DPCM systems are used to encode each column of the 
transformed data. 
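
The hybrid scheme can be sketched as follows for M = 4. This is a simplified illustration under stated assumptions: the Walsh-Hadamard transform stands in for the unitary line transform, and a single hypothetical predictor weight and uniform quantizer step are shared by all coefficient columns, whereas the text calls for a different DPCM system per column. The function names are illustrative.

```python
WAL4 = [[1, 1, 1, 1], [1, 1, -1, -1], [1, -1, -1, 1], [1, -1, 1, -1]]
STEP = 0.25       # hypothetical uniform quantizer step
WEIGHT = 0.9      # hypothetical one-element predictor weight


def transform_line(line):
    """Orthonormal 1-D transform of one line, as in (4.17), with M = 4."""
    return [0.5 * sum(w * s for w, s in zip(row, line)) for row in WAL4]


def inverse_line(coeffs):
    """Inverse 1-D transform, as in (4.18)."""
    return [0.5 * sum(coeffs[k] * WAL4[k][j] for k in range(4)) for j in range(4)]


def hybrid_encode(lines):
    """DPCM down each coefficient column; returns integer quantizer indices."""
    previous = [0.0] * 4          # reconstructed coefficients of the previous line
    codes = []
    for line in lines:
        line_codes = []
        for i, c in enumerate(transform_line(line)):
            prediction = WEIGHT * previous[i]
            index = round((c - prediction) / STEP)
            line_codes.append(index)
            previous[i] = prediction + index * STEP
        codes.append(line_codes)
    return codes


def hybrid_decode(codes):
    """Repeat the predictions, add the quantized differences, and invert each line."""
    previous = [0.0] * 4
    lines = []
    for line_codes in codes:
        previous = [WEIGHT * p + index * STEP for p, index in zip(previous, line_codes)]
        lines.append(inverse_line(previous))
    return lines


picture = [[2, 2, 2, 1], [2, 2, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1]]
decoded = hybrid_decode(hybrid_encode(picture))
worst_error = max(abs(a - b) for ra, rb in zip(picture, decoded) for a, b in zip(ra, rb))
```

Each coefficient is reconstructed to within half a quantizer step, so with these settings no picture element is in error by more than 0.25.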

The performance of hybrid encoders using various transformations 
is shown in Figure 36 for M = 16 and N = 256 [18]. Performance of the 
two-dimensional Hadamard and two-dimensional DPCM encoders is included 
for comparison. This figure clearly shows the superior performance of 
the hybrid encoder over both the two-dimensional Hadamard and 
two-dimensional DPCM encoders. 

The optimal hybrid coder shown in Figure 35 utilizes a set of 
different weighting coefficients $A_1, A_2, \ldots$ in the DPCM predictors at 
the transmitter and receiver. Also, the quantizers in the DPCM systems 
are designed based on the statistics of the video data. To simplify the 
encoder and permit its design to be independent of the signal statistics, 
a suboptimal system has been developed which uses a common value for $A_1$ 
through $A_M$, and which uses some general statistics (obtained from a number 
of typical pictures) to obtain variances of the differential signals $w_i(y)$. 
[Figure 36. Performance of the proposed hybrid systems, plotted as a ratio 
in dB against bit rate; the two-dimensional Hadamard transform and 
two-dimensional DPCM encoders are included for comparison.] 

Then only one DPCM encoder can be used to encode all transform coefficients. 
In the block diagram of the simplified hybrid encoder shown in Figure 37, 
a single analog-to-digital converter performs the quantization in all 
DPCM encoders. Differential signals corresponding to various transform 
coefficients are normalized by dividing them by an appropriate amplitude 
factor $a_i$ before being processed by the A/D converter. The system is 
designed to have a maximum of four bits per coefficient. The bit 
assignment procedure is programmed in the bit selector, which selects 0 to 4 
bits per coefficient in a predetermined manner. The nonlinearity of the 
quantizer is achieved by using a set of fixed nonuniform threshold levels 
in the A/D converter. The D/A converter uses the corresponding set of 
nonuniform reconstruction levels. 

The simulated results [18] show that performance of the simplified 
hybrid encoder is only slightly inferior to the performance of the optimum 
hybrid encoder. The reductions in signal-to-noise ratio and subjective 
quality were minimal and well worth the resulting hardware simplification. 

4.3 Three-Dimensional Techniques 

In a monochrome television signal, a large fraction of picture 
elements correspond to background material that does not change signifi- 
cantly from one frame to the next, while only a relatively small number 
of picture elements in a frame convey fresh information. From a statis- 
tical standpoint, the similarity of pixels from one frame to the next 
corresponds to a high level of interframe correlation. Thus, the statis- 
tical coding techniques exploiting spatial correlation that have been 
considered for coding single frames of data could, in principle, be 
extended to take advantage of the frame-to-frame correlation, thereby 
further reducing the bit rate required to transmit the data. Indeed, 
some research in the area of three-dimensional Fourier and Hadamard 
transformations has indicated that bit rates can be reduced by a factor 
of about 5 by incorporating the correlation in the temporal direction [19]. 
However, three-dimensional transform encoding systems suffer from the 
serious shortcomings of computational complexity and the requirement for 
large amounts of storage. For this reason, some researchers have avoided 
extending transform coding systems to a third dimension; instead, they 
have suggested suboptimum coding systems that do not require extensive 
amounts of memory or computations. 

Figure 37. Block Diagram of the DPCM Encoders for the Simplified 
Hybrid Encoder 

An efficient technique of interframe coding of monochrome television 
images is simply to transmit the grey levels of the elements that have 
changed in successive frames by replenishing the previous frames with the 
transmitted data [20,21]. Experiments with Picturephone signals using 
the conditional frame replenishment technique have indicated good coding 
results for an average of one bit per pixel. A major shortcoming of the 
system is that the data is generated at an uneven rate. This is caused 
by the variation in number of pixels which change beyond a fixed threshold 
in each frame. To transmit this data over a fixed channel requires 
buffering the data prior to transmission. The size of the buffer and the bit 
rate limit the amount of motion in the video data for which this system 
could be employed. It has been determined that a buffer size of 10 frames 
is needed to transmit television signals with a moderate degree of motion. 
However, the buffer size has been reduced to one frame by transmitting 
only clusters of data. This increases the hardware complexity. 
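
The replenishment idea can be sketched for a single scan line as follows. The function names, grey levels, and threshold are hypothetical. A frame is a flat list of grey levels, and each update carries the position of a changed element together with its new value.

```python
def replenish_encode(previous, current, threshold):
    """Return (position, grey level) pairs for the elements that changed
    by more than the threshold since the previous frame."""
    return [(i, new) for i, (old, new) in enumerate(zip(previous, current))
            if abs(new - old) > threshold]


def replenish_decode(previous, updates):
    """Replenish the stored previous frame with the transmitted updates."""
    frame = list(previous)
    for position, value in updates:
        frame[position] = value
    return frame


previous = [10, 10, 10, 50, 50, 10]     # hypothetical grey levels
current = [10, 11, 10, 10, 50, 52]
updates = replenish_encode(previous, current, threshold=1)
displayed = replenish_decode(previous, updates)
```

The number of updates depends on how much of the frame has moved, which is the source of the uneven data rate and the need for buffering noted above. In a practical system the encoder would compare each new frame against the frame the decoder has actually reconstructed, so that sub-threshold changes do not accumulate unnoticed.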

In many cases, it is straightforward to extend two-dimensional 
techniques to three dimensions. For example, three-dimensional DPCM is 
illustrated in Figure 38. In this case, the adjacent picture elements in 
the line, in the column, and in the previous frame are used to predict the 
current picture element. 

Three-dimensional Walsh-Hadamard transforms are also straightforward. 
Figure 39 presents the three-dimensional Walsh-Hadamard transform. To 
use the three-dimensional Walsh-Hadamard transforms for data compression, 
the probability of occurrence of a vector coefficient as a function of its 
amplitude must be measured. The probability distribution of Walsh-Hadamard 
vectors shown in Figure 40 illustrates that large vector amplitudes become 
rare as the vector sequency increases in the horizontal or vertical 
direction in still pictures. Probabilities of occurrence of large vector 
amplitudes also fall off with increased temporal sequency, but not as rapidly. 
This is due to the nature of the source; very fine spatial detail (that 
would create larger values of high sequency spatial "checkerboard" vectors) 
is not as common as motion spanning about 20-50 subpictures in four frames 
(which creates larger values of high sequency temporal vectors). 

Figure 39. Three-Dimensional Hadamard Basis Vector Representation 
of a 4x4x4 Pel Subpicture 

Figure 40. Vector Amplitudes as Functions of Horizontal and Temporal Sequency 

An alternative approach to the three-dimensional Walsh-Hadamard 
encoder is a system that would use a two-dimensional Fourier transform 
in the spatial directions and a DPCM encoder in the temporal direction. 
This system would generate the two-dimensional Fourier transform 
of each frame. Denoting the two-dimensional Fourier transform of the 
kth frame by $F_k(\mu,\nu)$, one can represent $F_k(\mu,\nu)$ by its amplitude and its 
phase, i.e., 

$$ F_k(\mu,\nu) = A_k(\mu,\nu) \, \exp [ j \theta_k(\mu,\nu) ] , \tag{4.19} $$

where $A_k(\mu,\nu)$ and $\theta_k(\mu,\nu)$ refer to the amplitude and phase planes of the 
kth frame. Many types of motions, such as a panned motion, correspond to 
significant changes in the phase plane and small changes in the amplitude 
plane; thus, for an efficient encoder, one would assign a larger fraction 
of the available binary digits to changes in the phase plane from one 
frame to the next and a smaller number of binary digits to the 
corresponding changes from one amplitude frame to the next. The other 
attractive feature of this coding method would be to relate the predictor 
in the DPCM feedback loop to the motion of the camera relative to the 
subjects and perform a better prediction for the changes in phase which 
are caused by the motion. The complexity of this system is significantly 
less than for the three-dimensional Walsh-Hadamard encoder. 
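
The claim that a panned motion alters mainly the phase plane follows from the shift property of the Fourier transform. The sketch below illustrates it in one dimension, with a circular shift of a hypothetical scan line standing in as an idealization of panning; the function name is illustrative and the transform is left unnormalized.

```python
import cmath


def dft(samples):
    """Unnormalized DFT: F_n = sum_k s(k) * exp(-j*2*pi*n*k/N)."""
    count = len(samples)
    return [sum(s * cmath.exp(-2j * cmath.pi * n * k / count)
                for k, s in enumerate(samples))
            for n in range(count)]


line = [0, 0, 4, 9, 4, 0, 0, 0]          # hypothetical scan line
panned = line[-2:] + line[:-2]            # same line shifted right by two samples

F_before = dft(line)
F_after = dft(panned)

# Largest change in amplitude between the two frames (zero up to rounding)
amplitude_change = max(abs(abs(a) - abs(b)) for a, b in zip(F_before, F_after))
# Phase rotation predicted by the shift property for a two-sample shift
predicted = [f * cmath.exp(-2j * cmath.pi * n * 2 / len(line))
             for n, f in enumerate(F_before)]
```

The amplitudes of the two transforms agree, while each phase is rotated by an amount proportional to the shift and to the frequency index. Real panning also brings new material into the frame, so the amplitude plane changes slightly in practice.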
5.0 COLOR TECHNIQUES 

5.1 NTSC Color Television System 

Color picture reproduction relies mainly on the principle of three 
primary color decompositions, where the picture is sampled through red, 
green, and blue filters. Hence, one may treat a color signal as a vector 
of three separate monochrome signals corresponding to the red, green, and 
blue contents of the object picture. If the color signal is transmitted 
in this manner, it would require three times the bandwidth needed for the 
monochrome transmission. As in the monochrome case, statistical redun- 
dancies exist not only as the spatial or time statistical correlations 
for each color, but also in the form of intra-color redundancy. Data 
compression can be accomplished in three separate steps. First, the 
bandwidth can be reduced due to the intra-color redundancy, then the 
spatial and time statistical data reduction described in section 2.4 
can be applied directly. The intra-color statistical redundancy can be 
described by considering the random vector, 

$$ S(t) = ( R(t), \, G(t), \, B(t) )^T , \tag{5.1} $$

where R(t) = red video signal, 

G(t) = green video signal, 

B(t) = blue video signal, 

as a continuous random process. Bandwidth reduction, in the sense of the 
least expected mean-square error criterion, can be obtained by applying 
the Karhunen-Loeve procedure to the set of recognizable color pictures. 
This normally results in a linear transformation of the original color 
signal vector, 

$$ M \, S(t) , \tag{5.2} $$

where M is a 3x3 matrix that diagonalizes the covariance matrix 

$$ E [ S(t) \, S^T(t) ] . \tag{5.3} $$

This Karhunen-Loeve procedure depends mainly on empirical analysis. 
Psychovisual phenomena are not invoked. Other transformations were sought. 
One of the methods used in the standard NTSC color TV transmission is to 
transform the signal vector S(t) into the coordinates that consist of 
the monochrome brightness or luminance and two vectors that lie on the 
chrominance plane (Figure 41). 


where 



(5.4) 


(5.5) 


Y(t) corresponds to the monochrome brightness, I(t) corresponds to the 
vector in the "orange red-cyan" direction in the chrominance plane, and 
Q(t) corresponds to the vector in the "magenta-green" direction. This 
choice of coordinates has the following advantages: 

(1) A black and white monochrome video signal can be readily 
formed by simply dropping the Q and I components. 

(2) It matches the psychovisual phenomenon that the human eye senses 
only black and white at very low luminosity. 

(3) For small color areas, the human eye exhibits "tritanopia" 
(or two-color vision). This corresponds to the decrease of spatial 
visual sensitivity in the Q component. 


Due to these facts, color video signals can be transmitted, with 
reasonable picture quality, when Y(t) has a bandwidth of 4 MHz, I(t) has 
a bandwidth of 1.5 MHz, and Q(t) has a bandwidth of only 0.5 MHz. Standard 
commercial color TV signals are transmitted by modulating a subcarrier 
with the chrominance signals, I(t) and Q(t): 


     M(t) = I(t) sin ω_c t + Q(t) cos ω_c t 
          = C_s(t) sin[ω_c t + C_h(t)] ,                  (5.6) 

where 

     C_h(t) = arctan[Q(t)/I(t)]                           (5.7) 

and 

     C_s(t) = sqrt[I^2(t) + Q^2(t)] .                     (5.8) 

C_h(t) and C_s(t) correspond, very roughly, to the constant hue and constant 
color saturation stream lines in the plane of chrominance. Thus, a typical 
commercial color TV signal can be expressed as 

     Y(t) + M(t) = Y(t) + I(t) sin ω_c t + Q(t) cos ω_c t .   (5.9) 

This composite modulated signal, together with the reference phase of the 
subcarrier frequency, enables the receiver to demodulate Y(t), I(t), and 
Q(t). The corresponding primary signals, R(t), G(t), and B(t), are 
obtained by the inverse linear transformation. 

The general color TV communication scheme is illustrated in Figure 42. 
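
The quadrature modulation of the chrominance signals and their recovery with a phase reference can be sketched as follows. This is a minimal illustration that assumes I and Q are constant over one subcarrier cycle and that the receiver's reference phase is exact; the function names and the number of samples per cycle are hypothetical.

```python
import math

def modulate(i_val, q_val, phase):
    """Chrominance signal I sin(phase) + Q cos(phase), where phase = subcarrier angle."""
    return i_val * math.sin(phase) + q_val * math.cos(phase)

def demodulate(i_val, q_val, n=64):
    """Recover (I, Q) by synchronous detection averaged over one subcarrier cycle."""
    i_sum = q_sum = 0.0
    for k in range(n):
        phase = 2.0 * math.pi * k / n
        m = modulate(i_val, q_val, phase)
        i_sum += 2.0 * m * math.sin(phase)   # in-phase product detector
        q_sum += 2.0 * m * math.cos(phase)   # quadrature product detector
    return i_sum / n, q_sum / n

def polar(i_val, q_val):
    """Amplitude and phase angle such that M = amplitude * sin(phase + angle)."""
    return math.hypot(i_val, q_val), math.atan2(q_val, i_val)

i_rec, q_rec = demodulate(0.3, -0.2)
amp, phi = polar(0.3, -0.2)
```

The amplitude returned by `polar` tracks color saturation and the angle tracks hue, which is why phase errors in the reference show up as hue shifts at the receiver.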



Space color television systems use a color camera which is basically 
a black-and-white camera. It has been converted to a field-sequential 
color camera by the addition of a rotating color wheel. This technique 
is very similar to the old CBS field-sequential system developed for color 
television in the early 1940's. This system was characterized by the use 
of color band filters at the camera and again at the receiver, with only 
one camera necessary for viewing a scene and only one picture tube needed 
at the receiver. This camera, like the old CBS camera, employs a color 
filter wheel to produce a serial color signal. 

The field-sequential system uses a rotating filter wheel to expose 
the camera's image tube sequentially at the desired broadcast scan rate 
to the red, blue, and green components of a scene. Thus, the need for 
complex optical paths and color registration adjustment, such as required 
in commercial color cameras, is eliminated. This enables the color camera 
to be lightweight and to require very little power. In addition, it is 
capable of operating in a large dynamic range of light levels. Since 
the output of a field-sequential system is in serial red-blue-green form, 
it is not compatible with present broadcast standards. This requires 
that a ground station color converter be utilized to change the sequential 
color signal to the standard parallel National Television System 
Committee (NTSC) color TV format so it can be rebroadcast by commercial 
stations. 

Figure 42. Modulated Composite Color Signal 

A diagram of the color television system [22] is shown in Figure 43. 
The image is focused by a zoom lens through the color filter wheel onto 
the faceplate of the image tube. To simplify the problem of synchroni- 
zation, the scan rate of the wheel as the color filters pass in front of 
the image tube must be the same as that of the TV networks, which is 
60 fields per second. This is achieved by dividing the wheel into six 
sections, with the colors arranged in red-blue-green, red-blue-green 
order, and by driving the wheel at 10 revolutions per second. The motor 
speed is held constant by the timing of the camera's sync generator. 


Figure 43. Apollo Color Television System 
(block labels include: lunar base equipment, parabolic antenna) 

The field-sequential color signal, which is transmitted by an 
S-band transmitter (for Apollo), is picked up and amplified by a receiver 
at the receiving station. The signal is then clamped in a processing 
amplifier to restore the dc component and reestablish the average light 
value of the reproduced image. 

The processed signal is placed into a series of two tape recorders 
for the purpose of compensating for doppler shift and presenting real- 
time information. The sequential color signal is then put into the scan 
converter that changes the video from the serial color format to the 
parallel (simultaneous) color format. The scan color converter is a 
storage and readout device holding the two previous fields in memory 
and presenting the three fields at once at the output upon the incidence 
of the third field. As the new field is placed into the memory, the 
oldest field is erased, updating the information at the field rate. 
Thus, the three colors are simultaneously read out in the same manner 
as the output from a standard three-tube NTSC color camera. After video 
color conversion, the signal is sent to an NTSC color encoder which 
processes it to form the composite video signal. 
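
The storage-and-readout behavior of the scan converter can be sketched as a two-field memory that emits all three colors each time a new field arrives. This is only a behavioral model under the assumption that fields arrive in a strict red-blue-green rotation; the function name and the data representation are hypothetical.

```python
from collections import deque

def scan_convert(fields):
    """Convert a serial stream of (color, field_data) into parallel R, G, B sets.

    The two previous fields are held in memory; when a third field arrives,
    all three are presented together, then the oldest field is discarded.
    """
    memory = deque(maxlen=2)          # the two previous fields
    for color, data in fields:
        if len(memory) == 2:
            frame = dict(memory)      # the two stored colors
            frame[color] = data       # plus the incoming one
            yield frame
        memory.append((color, data))  # the oldest entry is erased automatically

serial = [("R", 0), ("B", 1), ("G", 2), ("R", 3), ("B", 4)]
parallel = list(scan_convert(serial))
```

After the first two fields, one parallel set is produced per incoming field, so the output is updated at the field rate as described above.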

5.3 Demodulation of Color Signals 

Bandwidth compression of television signals exploits the correlation 
of the signal in both the spatial and temporal directions. To bring out 
this correlation, the composite color signal must be first demodulated. 
The field-sequential color signal is already in demodulated form. Spec- 
tral correlation of the field-sequential signal can be utilized by simply 
digitizing two consecutive fields (i.e., corresponding to red and green 
components) and storing these in digital buffers. The Y, I, and Q com- 
ponents (for each picture element) are then obtained by forming a linear 
combination of the red and green components with the incoming (blue) com- 
ponent. Demodulation of the NTSC color signal is more complicated. Two 
popular approaches to this problem are digital color demodulation, using 
comb filters, and NTSC to time domain multiplexed (TDM) conversion. 

5.4 NTSC Digital Color Video Data Compression 

Equation (5.9) enables color video information to be packed in a 
continuous analog signal. Therefore, the monochrome digital video data 
compression technique can be applied if the mean square error of the 
decoded signal is sufficiently small so that the phase and amplitude 
errors of the reconstructed signal produce visually tolerable errors. 
This can be done by increasing the sampling rate of the A/D 
converter and assigning more information bits to the quantization of the 
Hadamard components. This method has been experimented with by Enomoto 
and Shibata [13] using 3.75 bits per pel. However, this method is not 
recommended for spaceborne systems because: 

(1) To resolve the phase and amplitude information of the modu- 
lated signal to within tolerable visual sensations, a substantially 
higher sampling frequency must be assigned to the A/D converter. The 
typical requirement in this case is in the range of 12 MHz to 15 MHz. 
A/D converters of this type operate at the limit of the present state 
of the art. The corresponding data conversion accuracy, power consump- 
tion, weight and cost make them undesirable for limited environments, 
such as a space vehicle. 

(2) The corresponding Hadamard transformer and quantizer must 
also operate at higher rates. Consequently, power consumption, weight, 
etc., will increase. 

(3) The additional analog modulation-demodulation of the chromi- 
nance signals will contribute additional errors to the overall system. 

A possible digitized color video data compression algorithm sug- 
gested by Linkabit [23] is as follows: 

The three primary color video signals [R(t), G(t), B(t)] are first 
analog transformed into Y(t), I(t), and Q(t), using a simple resistor net- 
work and video inverting amplifiers. Among these, the monochrome 
brightness component, Y(t), conveys most of the picture information and 
requires the highest transmission bandwidth. For video reproduction 
purposes, 512 samples per horizontal line (approximately 8 MHz sampling 
frequency) and 8 information bits per sample are sufficient. I(t) and Q(t) 
have substantially lower bandwidth requirements (this results mainly from 
the poor spatial response of the human visual system to the chrominance 
components). Comparatively, either I(t) or Q(t) has bandwidth requirements 
of less than half that required by Y(t). This enables us to sample I(t) 
and Q(t) at half the sampling frequency used for Y(t) while maintaining 
sufficient chrominance resolution. The reduction in horizontal sampling 
rate is mainly due to the loss of chrominance response of human eyes to 
higher spatial frequencies. This reasoning can be applied equally well 
vertically. Consequently, I(t) and Q(t) may share the same A/D converter 
by sampling I(t) and Q(t) at alternate horizontal lines. For ripple- 
free operation, the sampling frequency for the A/D converter used for 
Q(t) and I(t) is synchronized to that used for Y(t) by a simple frequency 
divider. In addition, since the chrominance resolution of human eyes is 
much less than that of monochrome brightness, a 6-bit resolution seems 
to be sufficient for the I(t) and Q(t) A/D converter. This corresponds 
to 31 or more levels of color purity in the directions of white to cyan, 
white to orange red, white to green, and white to magenta. 
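
The shared-converter chrominance sampling described above can be sketched as follows. The sketch assumes chrominance values normalized to the range 0.0 to 1.0, and it assumes that I is taken from even-numbered lines and Q from odd-numbered lines; the report states only that the two signals alternate, so that assignment and the function names are hypothetical.

```python
def quantize(value, bits):
    """Uniformly quantize a value in [0.0, 1.0] to an unsigned integer code."""
    top = (1 << bits) - 1
    return min(top, max(0, round(value * top)))

def sample_chroma(i_lines, q_lines, bits=6):
    """Subsample chrominance given as scan lines at the Y sampling rate.

    Every other sample is kept (half the Y sampling frequency), and the
    converter alternates between I and Q from one line to the next.
    """
    i_out = [[quantize(v, bits) for v in line[::2]] for line in i_lines[0::2]]
    q_out = [[quantize(v, bits) for v in line[::2]] for line in q_lines[1::2]]
    return i_out, q_out

# 480 active lines of 512 samples each, filled with constant test values.
i_lines = [[0.5] * 512 for _ in range(480)]
q_lines = [[1.0] * 512 for _ in range(480)]
i_s, q_s = sample_chroma(i_lines, q_lines)   # each is 240 lines of 256 samples
```

Each chrominance component ends up with one quarter of the luminance sample count, so a single converter running at half the Y rate serves both.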

Using the above, the additional encoder front-end analog circuit 
requirements are: a 6-bit A/D converter, a resistor network, and an 
analog multiplexer for the I(t) and Q(t) signals. For limited environ- 
ments, as in a space vehicle, this seems preferable to compressing the 
composite modulated signal. 

In doing so, the digitized color signal can be represented by a 
lattice of 512 x 480 sample points for the monochrome brightness com- 
ponent, Y(t), and a lattice of 256 x 240 sample points for the chrominance 
components, I(t) and Q(t). The monochrome brightness subpictures are 
chosen with size of 4x4 samples, and the chrominance subpictures with 
2x2 samples. Larger chrominance subpicture sizes are not used because 
the spatial statistical correlation between sampling points within each 
subpicture is only a function of its physical dimensions. The bandwidth 
reduction obtained by lowering the sampling rate results merely from the 
psychovisual phenomena. Further, the chrominance subpictures are made 
to coincide with the monochrome brightness subpictures. This is illu- 
strated in Figure 44. The corresponding digitized color subpictures are 
represented by the following vectors: 

Monochrome Brightness Vector 

     Y = (x_11, x_12, ..., x_44)^T                        (5.11) 

Cyan-Orange Red Vector 

     I = (u_11, u_12, u_21, u_22)^T                       (5.12) 

Green-Magenta Vector 

     Q = (v_11, v_12, v_21, v_22)^T                       (5.13) 

where x_ij is the sampled brightness value at the (i,j) coordinate of the 
subpicture, and the u_ij and v_ij are the sampled chrominance values. 
This is illustrated in Figure 44. They have integer representations in 
the following ranges: 

     0 ≤ x_ij ≤ 255 

     0 ≤ u_ij, v_ij ≤ 63 .                                (5.14) 


Hadamard transformations can be applied to the above vectors: 

     HY  = (C_11, C_12, ..., C_44)^T                      (5.15) 

     H'I = (C^I_11, C^I_12, C^I_21, C^I_22)^T             (5.16) 

     H'Q = (C^Q_11, C^Q_12, C^Q_21, C^Q_22)^T             (5.17) 

where H is the Hadamard transform for 4x4 subpictures, and H' is that for 
2x2 subpictures. The transformed range is given as follows: 

     0 ≤ C_11 ≤ 4 x 255 

     -2 x 255 ≤ C_ij ≤ 2 x 255     for i ≠ 1 or j ≠ 1     (5.18) 

     0 ≤ C^I_11, C^Q_11 ≤ 2 x 63 

     -63 ≤ C^I_ij, C^Q_ij ≤ 63     for i ≠ 1 or j ≠ 1     (5.19) 
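
The coefficient ranges quoted above follow from a two-dimensional Hadamard transform normalized by the subpicture side length. The sketch below uses that normalization and a Sylvester-ordered Hadamard matrix; both are assumptions chosen to reproduce the stated ranges, since the report does not give its ordering or scaling explicitly.

```python
def hadamard(n):
    """Sylvester-type Hadamard matrix of order n (n must be a power of two)."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h

def transform(block):
    """Two-dimensional Hadamard transform H X H / n of an n x n block."""
    n = len(block)
    h = hadamard(n)
    hx = [[sum(h[i][k] * block[k][j] for k in range(n)) for j in range(n)]
          for i in range(n)]
    return [[sum(hx[i][k] * h[k][j] for k in range(n)) / n for j in range(n)]
            for i in range(n)]

flat_y = transform([[255] * 4 for _ in range(4)])      # DC term is 4 x 255
flat_c = transform([[63, 63], [63, 63]])               # DC term is 2 x 63
edge_y = transform([[255, 255, 0, 0] for _ in range(4)])  # one AC term is 2 x 255
```

A flat subpicture puts all of its energy in the first coefficient, while a full-contrast vertical edge drives a single non-DC coefficient to the 2 x 255 limit.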


The monochrome video data compression procedures described in sec- 
tions 3.0 and 4.0 can be applied to HY. When two-dimensionally compressed, 
HY requires 2 bits per pel or, equivalently, 491.52 kbits per frame. 
Likewise, components for H'I and H'Q can be compressed using similar 
logarithmic quantization procedures. In particular, due to the low 
spatial frequency response for the magenta-green chrominance component, 
C^Q_12, C^Q_21, and C^Q_22 are discarded. This follows from the reasoning 
that using C^Q_11 alone to approximate the Q(t) component corresponds to 
sampling Q(t) at one-fourth the sample frequency for Y(t) or, equivalently, 
sampling at 2 MHz. This is substantially higher than the Nyquist rate 
required for the 500 kHz bandwidth of the Q(t) signal used in the standard 
commercial color TV signal. Logarithmic DPCM techniques should be applied 
to the C^I_11 and C^Q_11 components in order to extract maximal advantage 
from the residual statistical correlation existing between adjacent 
subpictures. 

The quantization table for chrominance components is given in 
Table 7. The designation for the Hadamard components is shown in 
Figure 45. 

C^I_11 and C^Q_11 have 23 quantization levels using logarithmic DPCM 
methods. 

C^I_12 and C^I_21 have 15 quantization levels. 

C^I_22 has 9 quantization levels. 

C^I_11 and C^Q_11 together have 23 x 23 = 529 possible pairs of represen- 
tative values. To encode them, it would require more than 9 information 
bits. However, when both I(t) and Q(t) have extreme negative values, the 
resultant chrominance coordinate lies well outside the reproducible color 
triangle formed with the three primary color vertices. Hence, by com- 
bining the representative values and deleting the cutpoint pairs 
(C^I_11, C^Q_11) outside the reproducible color triangle, as shown in 
Table 7, C^I_11 and C^Q_11 are effectively encoded by 9 information bits. 

C^I_12, C^I_21, and C^I_22 have 11 information bits. The overall bit 
requirement for the chrominance components is 20 bits per subpicture. 

Table 7. Quantization Table 

C^I_11 and C^Q_11: Quantized by 23 Levels 

     Cutpoint        Representative Value 
       ± 96               ± 108 
       ± 74               ± 84 
       ± 57               ± 64 
       ± 44               ± 50 
       ± 33               ± 38 
       ± 24               ± 28 
       ± 17               ± 20 
       ± 11               ± 14 
       ± 7                ± 9 
       ± 3                ± 5 
                          ± 2 
                            0 

In addition, the following cutpoint pairs of (C^I_11, C^Q_11) are deleted; 
these occur outside the reproducible color triangle: (±96,-96), (±74,-96), 
(±57,-96), (±44,-96), (±33,-96), (±96,-74), (±74,-74), (±57,-74), (±44,-74). 
This enables C^I_11 and C^Q_11 to share 9 information bits. 

C^I_12 and C^I_21: Quantized by 15 Levels 

     Cutpoint        Representative Value 
       ± 39               ± 44 
       ± 29               ± 34 
       ± 21               ± 25 
       ± 15               ± 18 
       ± 10               ± 12 
       ± 6                ± 8 
       ± 2                ± 4 
                            0 

C^I_22: Quantized by 9 Levels 

     Cutpoint        Representative Value 
       ± 26               ± 32 
       ± 14               ± 20 
       ± 9                ± 12 
       ± 3                ± 6 
                            0 

C^I_12, C^I_21, and C^I_22 have 11 information bits. 

Figure 45. Designations of the Chrominance Hadamard Components 

Thus, the resultant two-dimensionally compressed color data requires 
3.25 bits per picture element (based upon 512 x 480 picture elements per 
frame) or, equivalently, 798.72 kbits per frame. Therefore, by transmit- 
ting the color pictures using only two-dimensionally compressed data, 
the overall bit rate requirement is 23.9616 Mbits per second. 
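
The bit budget quoted in this section can be checked with a few lines of arithmetic. The frame rate of 30 frames per second is an assumption; it is not stated here but is implied by the ratio of the quoted bit rate to the quoted bits per frame.

```python
import math

# Joint coding of the two first-order chrominance coefficients.
pairs_total = 23 * 23                   # 529 pairs of representative values
pairs_deleted = 2 * 9                   # nine listed entries, each with a +/- first element
pairs_kept = pairs_total - pairs_deleted
bits_c11 = math.ceil(math.log2(pairs_kept))          # bits for the pair

# Joint coding of the three remaining I-component coefficients.
bits_ac = math.ceil(math.log2(15 * 15 * 9))

chroma_bits_per_subpicture = bits_c11 + bits_ac      # per 4x4 luminance block
bits_per_pel = 2 + chroma_bits_per_subpicture / (4 * 4)   # 2 bits/pel for luminance
bits_per_frame = bits_per_pel * 512 * 480
frames_per_second = 30                  # assumed
bit_rate = bits_per_frame * frames_per_second

print(pairs_kept, bits_c11, bits_ac, bits_per_pel, bits_per_frame, bit_rate)
```

Deleting the 18 unreachable pairs is what brings the joint alphabet down from 529 to 511 entries, just inside the 512 codes that 9 bits provide.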

5.5 Field-Sequential Color Video Data Compression 

The Field-Sequential Color TV system uses a modified monochrome 
TV camera with a rotating color wheel. The rotating color filter exposes 
the camera image tube sequentially, at the commercial broadcast scan rate, 
to the red, blue and green components of a scene. Therefore, sequential 
fields differ in spectral components in addition to the field-to-field 
variations caused by temporal motion. This generates spectral and tem- 
poral correlation in addition to the spatial correlation inherent in all 
pictorial data. However, due to spectral variations, the relative degree 
of the total spectral and temporal correlation of the Field-Sequential 
Color TV is less than the temporal correlation between the sequential 
fields of monochrome television. Therefore, one cannot achieve as high 
a bandwidth compression with Field-Sequential Color TV as one achieves 
with monochrome television for the same image fidelity. 

The bandwidth of the Field-Sequential signal is significantly 
smaller than the bandwidth of a standard color TV system. This is because 
the standard NTSC color television uses three color guns, each with the 
same bandwidth as the camera used in the field-sequential color TV system. 
However, the signal generated by NTSC color television systems exhibits 
more temporal and spectral correlation; therefore, its bandwidth can be 
compressed by a larger ratio. 

The salient feature of the Field-Sequential Color TV signal is that 
the sequential fields exhibit temporal as well as spectral correlation, 
and exploiting those correlations in addition to spatial correlation is 
essential to the efficient bandwidth compression of the Field-Sequential 
Color TV. Spectral correlation is best utilized by using red, green and 
blue fields to generate the illuminance (Y) and the chromaticity com- 
ponents (I, Q). These are related to the red, green and blue components 
of the color signal as follows: 

     Y = 0.30 R + 0.11 B + 0.59 G 

     I = 0.74 (R-Y) - 0.27 (B-Y) 

     Q = 0.48 (R-Y) + 0.41 (B-Y) .                        (5.20) 
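
A short sketch of this transformation and its inverse is given below. It uses the standard NTSC coefficients, in which the (B-Y) term of Q carries a positive sign; the function names are hypothetical, and the inverse is obtained by solving the two chromaticity equations for (R-Y) and (B-Y).

```python
def rgb_to_yiq(r, g, b):
    """Illuminance and chromaticity components from the primaries."""
    y = 0.30 * r + 0.59 * g + 0.11 * b
    i = 0.74 * (r - y) - 0.27 * (b - y)
    q = 0.48 * (r - y) + 0.41 * (b - y)
    return y, i, q

def yiq_to_rgb(y, i, q):
    """Invert rgb_to_yiq by solving the 2x2 system for (R-Y) and (B-Y)."""
    det = 0.74 * 0.41 + 0.27 * 0.48
    r = y + (0.41 * i + 0.27 * q) / det
    b = y + (-0.48 * i + 0.74 * q) / det
    g = (y - 0.30 * r - 0.11 * b) / 0.59
    return r, g, b

gray = rgb_to_yiq(0.5, 0.5, 0.5)                       # chromaticity vanishes
round_trip = yiq_to_rgb(*rgb_to_yiq(0.8, 0.3, 0.1))    # recovers the primaries
```

For a gray input (equal primaries) both chromaticity components are zero, which is the maximum energy compaction case discussed in the text.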
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Sequential fields of the Field-Sequential Color TV signal are composed 
of the odd and even lines as shown on Figure 46. 

In combining red, green and blue components to generate Y, I, and 
Q, one utilizes the correlation of the spectral components for a maximum 
compaction of energy in the illuminance signal. A spectral compaction 
that results from identical signals for the red, green and blue components 
produces a maximum value illuminance and a zero grey level for the chro- 
maticity components. On the other hand, the most dissimilar red, green 
and blue signals will result in an identical signal for the illuminance 
and the chromaticity components. In Field-Sequential Color TV, the 
sequential fields are composed of odd and even lines. The odd red field 
exhibits spectral similarity with the odd blue and odd green fields. 
However, these samples are separated by a temporal distance of 4/60 of 
a second and this causes some spectral decorrelation due to temporal 
motion. The Y, I, and Q signals formed from all-odd or all-even frames, 
which requires storing a minimum of 4 frames, are particularly susceptible 
to spectral decorrelation from rapid temporal motion. An alternate pro- 
cedure is to mix the odd and even fields in generating the illuminance 
and the chromaticity components. This requires storing only two fields. 
The mixing of the odd and even fields results in a smaller correlation 
among the spectral components but a larger temporal correlation, since 
the three fields used in generating Y, I, and Q are only 2/60th of a 
second apart. This gives a larger or smaller compaction of energy in the 
illuminance signal depending upon the comparative size of spectral simi- 
larity and temporal motion. 

An approach with attractive implementation properties for data 
compression uses the standard color wheel and substitutes the green field 
for the illuminance signal. Then the chromaticity components are obtained 
by subtracting the green from the red and the blue components. This 
approach is based on the fact that the green spectral component is very 
similar to the illuminance component for TV signals. Also, the green 
component possesses more energy and shows more details than the red and 
blue components. Using the green component instead of the illuminance, 
the transmission tristimulus signals are 



     G ; 

     C_1 = R - G ; 

     C_2 = B - G .                                        (5.21) 

C_1 and C_2 possess a much smaller bandwidth and a smaller fraction of the 
signal energy than the G component; therefore, they can be transmitted 
in a subsampled form utilizing a smaller fraction of the available bit 
rate. 

Figure 46. Statistics of the Field-Sequential G, R-G, and B-G Fields 
     (a) Combining Odd and Even Fields Separately to Generate 
         Y, I, and Q Signals 
     (b) Combining Odd and Even Fields to Generate Y, I, and Q Signals 

For data compression, the illuminance signal is used directly, 
while the chromaticity components are subsampled. The chromaticity 
signals can be subsampled by taking every other sample without affecting 
the quality of the reconstructed signal since human vision is rather 
insensitive to color information at high frequencies. Transmitting the 
chromaticity components in subsampled form means that the reconstructed 
composite signal at the receiver has illuminance as well as the chromi- 
nance signals at low frequencies, but contains only the illuminance infor- 
mation at high frequencies. Subjective experiments with television 
viewers at normal viewing distances have indicated that this is, in fact, 
acceptable [24]. 

Prior to subsampling the chromaticity signals, they must be filtered 
to eliminate their high frequency components to prevent aliasing. Although 
there exist sophisticated filtering techniques to eliminate the high fre- 
quency components of discrete signals, experiments with imagery data have 
shown that a simple 3-point "Hanning" filter can be used with comparable 
results. A 3-point Hanning filter uses weightings of 1/4, 1/2 and 1/4 
to obtain the filtered signal as follows: 

     X̂_{i,j} = (1/4) X_{i,j-1} + (1/2) X_{i,j} + (1/4) X_{i,j+1} ,   (5.22) 

where X̂_{i,j} is the lowpass filtered form of X_{i,j}. This filter is 
particularly attractive for digital signals since the multiplications can 
be performed by shift operations. In the proposed system, the chromaticity 
signals are filtered by the 3-point Hanning filter prior to a 2-to-1 
subsampling of these signals. 
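
The filter-then-subsample step can be sketched with integer shifts in place of multiplications. Repeating the end samples at the line borders and rounding to the nearest integer are assumptions of this sketch, as are the function names.

```python
def hanning3(line):
    """3-point filter with weights 1/4, 1/2, 1/4 on integer samples, using shifts."""
    n = len(line)
    out = []
    for j in range(n):
        left = line[max(j - 1, 0)]          # repeat the first sample at the border
        right = line[min(j + 1, n - 1)]     # repeat the last sample at the border
        # (left + 2*center + right) / 4, with +2 for rounding to nearest
        out.append((left + (line[j] << 1) + right + 2) >> 2)
    return out

def subsample(line):
    """Lowpass filter a chromaticity line, then keep every other sample."""
    return hanning3(line)[::2]

flat = hanning3([40] * 8)             # a constant line passes unchanged
busy = hanning3([0, 64] * 4)          # the highest-frequency pattern is smoothed
half = subsample([0, 64] * 4)
```

The alternating pattern, which would alias badly under plain 2-to-1 decimation, is flattened to its mean value in the interior of the line before samples are dropped.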

A detailed study by TRW [18] of bandwidth compression algorithms 
that process the G, R-G and B-G signals has led to three candidate 
techniques. These are two-dimensional DPCM, adaptive two-dimensional DPCM, 
and the Hybrid system combining a Hadamard Transform with a DPCM encoder. 

5.5.1 Two-Dimensional DPCM System 

Two-dimensional DPCM systems were discussed in Section 4.0. Using 
a third-order fixed predictor, the picture element X_{i,j} is predicted using 
a linear combination of the adjacent element on the same line, the adja- 
cent element on the same column, and the element diagonally across from 
X_{i,j} as follows: 

     X̂_{i,j} = 0.75 X_{i,j-1} + 0.75 X_{i-1,j} - 0.5 X_{i-1,j-1} .   (5.23) 

The fixed values of 0.75, 0.75 and -0.5 have been used for the weightings 
of the predictor since these weights provided the best overall results 
in the simulation studies. In addition, digital multiplication by these 
numbers can be performed by simple shift-and-add operations. The DPCM 
encoder uses two quantizers. One consists of 8 quantization levels and 
is used to encode the green signal component. The other consists of 4 
quantization levels and is used to encode the chromaticity components. 
The cutpoints and output levels in the quantizer are selected for the 
best performance as measured by mean square error and the subjective 
quality of the reconstructed imagery. The two quantizer characteristics, 
which are symmetrical, are shown in Figure 47. For convenience, only the 
positive portion of the quantizer characteristics is shown in the figure. 
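
A compact sketch of a two-dimensional DPCM loop with this third-order predictor is shown below. The cutpoints and reconstruction levels are hypothetical placeholders, since the values of Figure 47 are not reproduced here; the mid-gray border value of 128 and the truncating shift-and-add arithmetic are likewise assumptions of the sketch.

```python
import bisect

# Hypothetical 8-level symmetric quantizer (positive half only).
CUTPOINTS = [4, 12, 28]      # decision thresholds on |prediction error|
LEVELS = [1, 7, 18, 40]      # reconstruction magnitudes

def quantize(err):
    """Map a prediction error to a 3-bit code in -4..3."""
    idx = bisect.bisect_right(CUTPOINTS, abs(err))
    return idx if err >= 0 else -idx - 1

def dequantize(code):
    """Reconstruction value for a code produced by quantize()."""
    return LEVELS[code] if code >= 0 else -LEVELS[-code - 1]

def predict(rec, i, j):
    """0.75*left + 0.75*above - 0.5*diagonal, using shifts and adds."""
    left = rec[i][j - 1] if j > 0 else 128
    up = rec[i - 1][j] if i > 0 else 128
    diag = rec[i - 1][j - 1] if i > 0 and j > 0 else 128
    s = left + up
    return (s >> 1) + (s >> 2) - (diag >> 1)

def encode(image):
    """Return (codes, reconstruction); prediction uses reconstructed values only."""
    rows, cols = len(image), len(image[0])
    rec = [[0] * cols for _ in range(rows)]
    codes = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            p = predict(rec, i, j)
            codes[i][j] = quantize(image[i][j] - p)
            rec[i][j] = min(255, max(0, p + dequantize(codes[i][j])))
    return codes, rec

def decode(codes):
    """Rebuild the picture from the codes with the same predictor."""
    rows, cols = len(codes), len(codes[0])
    rec = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            rec[i][j] = min(255, max(0, predict(rec, i, j) + dequantize(codes[i][j])))
    return rec

image = [[min(255, 100 + 3 * i + 5 * j) for j in range(16)] for i in range(8)]
codes, rec = encode(image)
```

Because the encoder predicts from its own reconstructed values rather than from the original samples, the decoder stays in step with it and quantization errors do not accumulate.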

5.5.2 Adaptive Two-Dimensional DPCM Systems 

Two adaptive two-dimensional DPCM systems are very promising. These 
are the block-adaptive DPCM encoder using multiple prediction loops and 
the block-adaptive DPCM system that uses a single prediction loop with a 
quantizer characteristic controlled by an auxiliary gain computation loop. 

1. Block-Adaptive DPCM Encoder Using Multiple Loops. The block 
diagram of the encoder is shown in Figure 48. Each DPCM loop uses a 
third-order predictor defined by equation (5.23). The quantizer in each 
loop has a different characteristic. The four quantizers are scaled ver- 
sions of those shown in Figure 47. The scaling constants are 1/2, 1, 2, 
and 4. Each DPCM loop stores a block of 16 encoded samples in the shift 
registers and computes the total encoding error. The block select logic 
compares the distortion (total encoding error) for each loop, and transmits 
the contents of that shift register which corresponds to the smallest 
distortion. The receiver needs information as to which DPCM loop was 
utilized for each block of data; therefore, two bits of overhead infor- 
mation are transmitted with each block indicating which loop (quantizer) 
was selected. This increases the bit rate by 1/8th of a bit per sample. 

Figure 47. Positive Cutpoints and the Reconstruction Levels 
of the Quantizers for the 2D-DPCM System (panel (a): 2-Bit Quantizer) 

Figure 48. Block-Adaptive 2D-DPCM System Using Multiple Loops 

2. Block-Adaptive DPCM Encoder Using a Single Loop. In principle, 
this technique is similar to the adaptive DPCM system with multiple loops. 
Here, a block of 16 samples is used in the gain computation loop (without 
a quantizer) to generate an estimate for the variance of the differential 
signal as shown in Figure 49. Depending upon the value of the variance, 
one of M gain factors is selected and used to scale the quantizer charac- 
teristic in the prediction loop. Eight possible gain factors (1/8, 1/4, 
1/2, 1, 2, 4, 8, 16) are used in this system, requiring 3/16th of a bit 
per sample for transmitting the overhead information. 
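
The overhead figures for both adaptive schemes, and one possible gain-selection rule for the single-loop encoder, can be sketched as follows. The selection rule (pick the gain closest, on a logarithmic scale, to the ratio of the block's RMS difference to a nominal design value) and the value of `nominal_sigma` are assumptions; the report says only that the gain depends on the estimated variance.

```python
import math

GAINS = [1/8, 1/4, 1/2, 1, 2, 4, 8, 16]   # the eight gain factors
BLOCK = 16                                 # samples per block

def overhead_bits_per_sample(num_choices, block=BLOCK):
    """Side information needed to identify one of num_choices per block."""
    return math.log2(num_choices) / block

def select_gain(differences, nominal_sigma=8.0):
    """Choose a gain factor from the spread of a block of differential samples."""
    variance = sum(d * d for d in differences) / len(differences)
    sigma = math.sqrt(variance)
    target = sigma / nominal_sigma if sigma > 0 else GAINS[0]
    return min(GAINS, key=lambda g: abs(math.log2(g) - math.log2(target)))

multi_loop_overhead = overhead_bits_per_sample(4)    # four quantizers
single_loop_overhead = overhead_bits_per_sample(8)   # eight gain factors
```

Calling `select_gain` on a quiet block returns the smallest gain, while a block with large differences saturates at the largest one, which mirrors how the quantizer scale follows local picture activity.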

5.5.3 Hadamard Transform/DPCM System 

Hybrid encoders use a concatenation of a unitary transform (e.g., 
a Hadamard transform) and a DPCM encoder (Figure 50). A block size of 
four picture elements is used for simple implementation. Each Hadamard 
coefficient (H0, H1, H2, H3) is encoded with a DPCM loop using a one- 
element predictor with a fixed weighting coefficient. The bit assignment 
and weighting coefficient for the DPCM loops encoding the green signal and 
the chromaticity components are listed in Table 8. The quantizers in the 
DPCM loops are similar to those in Figure 47. However, the scaling of 
the threshold and reconstruction values for each loop is different. 
Scaling values of 2, 1, 1/4 and 1/4 are used for the DPCM loops encoding 
H0, H1, H2, and H3, respectively. 


Table 8. Bit Assignment and Weighting Coefficients per DPCM System 
         in Hybrid Encoder 

Type of                     DPCM for H0  DPCM for H1  DPCM for H2  DPCM for H3 
Signal      Parameters      Coefficient  Coefficient  Coefficient  Coefficient 

Green       Bits/Sample          4            3            3            2 
Field       Weighting 
            Coefficients        7/8          3/4          3/4          1/2 

Chroma-     Bits/Sample          3            2            1            0 
ticity      Weighting 
            Coefficients        3/4          3/4          1/2          1/2 
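
The average bit rates implied by the table can be tallied with a short sketch, shown here together with a 4-point Hadamard transform of one block. The sequency ordering of the transform rows and the absence of normalization are assumptions, and the names are hypothetical.

```python
from fractions import Fraction

def hadamard4(x):
    """Unnormalized 4-point Hadamard transform of one block of four samples."""
    a, b, c, d = x
    return [a + b + c + d, a + b - c - d, a - b - c + d, a - b + c - d]

# Bits assigned to the DPCM loops for H0..H3.
BITS = {"green": [4, 3, 3, 2], "chromaticity": [3, 2, 1, 0]}

def bits_per_sample(kind):
    """Average bits per picture element for a 4-element block."""
    return Fraction(sum(BITS[kind]), 4)

flat_block = hadamard4([10, 10, 10, 10])
green_rate = bits_per_sample("green")
chroma_rate = bits_per_sample("chromaticity")
```

With these assignments the green field is coded at 3 bits per sample and each chromaticity field at 1.5 bits per sample, before any subsampling of the chromaticity signals is taken into account.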













Figure 50. Hybrid Hadamard/DPCM Encoder (output labeled "Encoded Data") 

6.0 TRI-STATE DELTA MODULATOR (TSDM) 

6.1 Introduction and Overview 

A technique that was considered of major importance for TV data 
compression is delta modulation and, in particular, tri-state delta 
modulation. 

A conventional delta modulator (CDM) encoder extracts only the 
direction of changes in the estimate of the input signal for transmission 
through the channel. Because of this, such a modulator is simply a two- 
state, one-bit-per-sample data compression scheme that obtains a better 
bandwidth utilization efficiency than pulse code modulation (PCM). A 
significant effort has therefore been made over the last decade to perfect 
the delta modulator (DM) channels for video and voice transmission. The 
various solutions suggested are all confronted with the difficult task 
of finding a good technical compromise between the conflicting require- 
ments to achieve a high transmission fidelity with the minimal possible 
hardware complexity. 

When trying to investigate the performance of a DM in picture 
transmission, the typical video signals obtained from scanning a picture 
in the conventional manner should be considered. These video signals 
are characterized by frequent voltage discontinuities of large amplitudes 
and very short "rise times" that correspond to abrupt changes in the gray 
levels of the picture (at its "contours"). In what follows, the delta 
modulator is considered to be in the "acquisition" mode when trying to 
catch up with these fast transitions. It shall be considered to be in 
the "tracking" mode during the other periods in which it "tracks" a con- 
stant voltage level (that corresponds to a constant shade, or "gray level," 
that follows a transition). It should be stressed that the requirements 
imposed on the channel design by the two mentioned modes are of a con- 
flicting nature. A fast acquisition capability will reduce the "slope- 
overload" of the channel but will cause a high "granularity" during the 
tracking periods. 

It seems that most of the adaptive delta modulators (ADM) described 
in the literature assign a higher preference to a high acquisition speed 
rather than to smooth tracking. A fast acquisition behavior usually leads, 
however, to instabilities in the signal reconstruction mechanism. The 
estimate of the signal then "overshoots" when the channel switches from 
the acquisition mode to the tracking mode, producing a "multiple-edge" 
effect in the received picture. It sometimes also overshoots during the 
acquisition time itself, leading to unwanted "edge busyness" effects. 

Methods to alleviate these instabilities and to ensure that the estimate 
converges fast enough to the new level have recently been proposed. 
Some methods rely on stability bounds imposed on the channel parameters 
that control the acquisition speed. Others suggest overshoot-suppression 
(OSS) algorithms that try to smooth the transitions between the two oper- 
ating modes. 

The tri-state delta modulator (TSDM) achieves improved performance 
over the conventional one (CDM) by breaking the inherent interdependence 
between the acquisition and the tracking modes of operation. This inde- 
pendence between the two abovementioned modes is obtained by adding a 
third state (the "level" state) to the usual "rise" and "fall" states of 
the CDM. 

Thus, a conventional delta modulator is clearly a "zero-error 
seeking feedback loop" type of device. It produces an estimate, X_k, for 
each (sampled) value of the signal, S_k, applied to its input. The dif- 
ference between them (the "error") is then used by the device for updating 
the next estimate. It is therefore obvious that such a device will operate 
properly only with nonzero errors and this is the basic cause for the 
"granularity" effect inherent in any delta modulator when tracking con- 
stant voltages. This behavior of the CDM in tracking "levels" can also 
be linked to the differential nature of the device which enables it only 
to detect (and transmit) changes in the signal voltage. 

In comparison, the tri-state delta modulator can detect and handle 
"no-change" (level) situations in addition to the "up" and "down" changes. 
It therefore incorporates the "zero-error" case as a third and valid state 
in its operation. This third state eliminates the very source of the 
granularity effect mentioned above. Therefore, the TSDM device is char- 
acterized by the following three mutually exclusive states: 

1. The state in which the voltage level of the signal estimate 
is continuously rising (the "rise" state). 

2. The state in which the voltage level of the estimate falls 
continuously (the "fall" state). 

3. The state in which the estimate equals the signal (within a 
given voltage tolerance, ε_v); this "level" state is what makes the TSDM 
device different from its predecessors. 

Specifically, a sensing device which continuously splits the video 
signal or its estimate (as the case may be) into the three activity type 
components forms the basis of the three-level delta modulator. Thus, 
the TSDM generates sequences (runs) of +1's during the entire period of 
time when the estimate of the signal is rising. It generates strings 
of -1's when the signal is falling, and 0's during level situations. A 
level situation is declared, at time sample k+1, whenever 

     |S_{k+i} - X_k| ≤ ε_v 

for i = 1, 2, ..., and the length i of the run of zeroes so obtained will 
stop increasing when this inequality first fails. 
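
The difference between two-state and three-state operation can be illustrated with a deliberately simplified, fixed-step model. The fixed step size, the tolerance value, and the function names are assumptions of this sketch; it is not the adaptive TSDM described in the remainder of this section. The tolerance is chosen as half the step so that the estimate cannot hunt around a constant level.

```python
def tsdm_encode(signal, step=4, tol=2, start=0):
    """Fixed-step tri-state delta modulator: one symbol (+1, 0, -1) per sample."""
    est, symbols, track = start, [], []
    for s in signal:
        err = s - est
        if err > tol:
            sym = 1            # "rise" state
        elif err < -tol:
            sym = -1           # "fall" state
        else:
            sym = 0            # "level" state: estimate is held
        est += sym * step
        symbols.append(sym)
        track.append(est)
    return symbols, track

def cdm_encode(signal, step=4, start=0):
    """Conventional two-state delta modulator, for comparison."""
    est, symbols, track = start, [], []
    for s in signal:
        sym = 1 if s >= est else -1
        est += sym * step
        symbols.append(sym)
        track.append(est)
    return symbols, track

constant = [40] * 20
tsdm_symbols, tsdm_track = tsdm_encode(constant)
cdm_symbols, cdm_track = cdm_encode(constant)
```

On this constant input the tri-state output settles into a run of zeroes once the level is acquired, whereas the two-state output keeps alternating between +1 and -1, which is the granularity effect described above.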

Advantages of the Tri-State Delta Modulator

From the above description, the main inherent advantages of a delta
modulator based on such an "activity" sorting device can be summarized
as follows:

1. Ease in the run-length encoding of the signal. This could
facilitate the implementation of simple bandwidth-preserving schemes in
the pertinent communication links, since large portions of any picture
merely contain background information that easily qualifies as "levels"
over long periods of time. These levels can easily be extracted from the
video signals and properly coded.

2. No granularity in the received signal. Another important
advantage of the TSDM has to do with the mechanism used to reconstruct
the video signals at the receiver end of the channel. Here, the splitting
into three activity elements, rather than two, facilitates the
tracking of the "levels" without the "granularity" associated with the
conventional two-state delta modulators.

3. Ease of hardware implementation. Implementing the TSDM appears
simpler than would have been expected. The reason for this is that the
"level" zone defined by $\pm\epsilon_v$ can be easily generated within
such electronic sensing devices as voltage comparators.

4. Ease of applying smoothing techniques to noisy signals. The
voltage zone of $\pm\epsilon_v$ also provides a powerful, yet extremely
simple-to-implement (and to preset) tool for smoothing the video signals
and for "clearing" them of excessive noise.

5. More efficient OSS algorithms. Since the overshoots in the
voltage level of the estimate pass through one of the three states
when entering the tracking mode, their recovery times can be significantly
reduced. The increased number of degrees of freedom therefore permits
more efficient utilization of possible OSS algorithms.
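
As an illustration of the first advantage, the following is a minimal sketch of run-length coding applied to a tri-state sequence. It is not taken from the report: the function names and the (symbol, run length) pair format are assumptions chosen for clarity, and a practical link would still need a bit-level code for the pairs.

```python
from itertools import groupby


def run_length_encode(states):
    """Collapse a sequence of tri-state symbols (+1, 0, -1) into
    (symbol, run length) pairs. The pair format is illustrative only."""
    return [(symbol, len(list(group))) for symbol, group in groupby(states)]


def run_length_decode(runs):
    """Expand (symbol, run length) pairs back into the symbol sequence."""
    states = []
    for symbol, length in runs:
        states.extend([symbol] * length)
    return states


# A rise, a long level (background), a short fall, and another level.
states = [1, 1, 1, 0, 0, 0, 0, 0, 0, -1, -1, 0, 0, 0]
runs = run_length_encode(states)
```

Long "level" stretches collapse into a single pair, which is where the bandwidth saving described above would come from.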

6.2 Detailed TSDM Operation Description 

6.2.1 Functional Block Diagram Description 

The functional block diagram of the TSDM is shown in Figure 51.
Note that, with the exception of the dead zone in the hard limiter
characteristic, this block diagram is similar to that of a conventional
delta modulator (CDM). It is the dead zone in the limiter characteristic
which provides this modulator with the tri-state extraction capability.
The output of the modulator is then a state indicator, $B_k$, which
provides the information about the type of "activity" characterizing the
current signal estimate.

The $B_k$ state indicator lends itself easily to run-length encoding
schemes. The information can thus be encoded into either one of the
following outputs:

(1) A stream of two-bit serial words; or

(2) A stream of three-bit parallel words.

If the second approach is taken, the modified block diagram of
Figure 52 applies. The notations shown in the second block diagram can
be used conveniently for describing the encoding of the state vector into
a two-bit serial word which is entered into the estimator. The
translation used is given in Table 9.


Table 9. Representation of the State Information
[Note that $B_k = (r_k - f_k)$]

  $\ell_k$    $f_k$    $r_k$    $B_k$    Two-bit word
     1          0        0        0         0 0
     0          1        0       -1         1 1
     0          0        1        1         0 1



Figure 51. Block Diagram of the TSDM

Figure 52. Alternate Block Diagram of the TSDM

6.2.2 Operational Equations

The equations describing the operation of the tri-state delta
modulator follow closely those of the conventional ones, with
modifications as indicated below. The device will behave exactly like a
conventional adaptive DM in the acquisition mode. During each tracking
mode situation, however, the estimator will simply keep repeating the
voltage level estimated when the tracking mode was first entered.

The equations, in particular the one calculating the value of
the coming step size, can be considered as generalized versions of
those describing the conventional DM. The definitions used are given
below. According to the usual convention, k indicates the kth time
sample. Let

    $S_k$ = (sampled and digitized) signal to be transmitted

    $X_k$ = the estimate of $S_k$

    $r_k$ = an indicator which, when set to 1, indicates that the
            voltage output of the estimator needs to "rise" to meet
            the signal

    $f_k$ = indicator for "falling"

    $\ell_k$ = indicator for maintaining the steady-state condition
            (level)

    $b_{ki}$ = any "activity" element of the above ($r_k$, $f_k$, or $\ell_k$)

    $B_k$ = numerical representation of $\mathbf{b}_k$.

It is also required that the state indicators be mutually
exclusive events, i.e., with $b_{ki}$ = 0 or 1, and letting

    $\mathbf{b}_k = [\,b_{k1},\ b_{k2},\ b_{k3}\,]^T$,

equation (6.1) below should be satisfied:

    $\sum_{i=1}^{3} b_{ki} = r_k + f_k + \ell_k = 1$.                  (6.1)

Also let

    $D_k = S_k - X_k$                                                  (6.2)

with

    $X_k = X_{k-1} + \Delta_k$                                         (6.3)

as in the CDM. Defining now

    $B_k = \mathrm{sign}(D_k)$, whenever $|D_k| > \epsilon_v$
    $B_k = 0$, otherwise,                                              (6.4a)

we have (for the state selector used) the following requirements:

    $r_k = 1$, whenever $D_k > \epsilon_v$ or $B_k > 0$
    $r_k = 0$, otherwise

    $f_k = 1$, whenever $-D_k > \epsilon_v$ or $B_k < 0$
    $f_k = 0$, otherwise

    $\ell_k = 1$, whenever $|D_k| \le \epsilon_v$ or $B_k = 0$
    $\ell_k = 0$, otherwise.                                           (6.4b)

The new equation for calculating the next step size is now

    $\Delta_{k+1} = [\alpha (r_k - f_k) + \beta (r_{k-1} - f_{k-1})] \cdot |\Delta_k|$   (6.5a)

with

    $\Delta_{k+1} = 2\Delta_0 B_k$ whenever $|\Delta_{k+1}| < 2\Delta_0$ and $\ell_k \ne 1$   (6.5b)

    $\Delta_{k+1} = 2\Delta_0 B_k$ whenever $B_{k-1} = 0$ and $|B_k| = 1$.   (6.5c)


In (6.5b) and (6.5c), $2\Delta_0$ is the minimum permitted step size in
the acquisition mode. Also, the terms $\alpha$ and $\beta$ are the same
positive constants which are used to define the operation of
conventional (adaptive) delta modulators. Note also that (6.5a) can be
written as

    $\Delta_{k+1} = (\alpha B_k + \beta B_{k-1}) \cdot |\Delta_k|$.    (6.5d)

The $\epsilon_v$ in (6.4) represents, as already mentioned, a small
positive constant that takes into consideration the voltage tolerance
levels inherent in the amplitude comparators or the signal smoothing
requirements.

When EIP (essential information preserving) rather than SIP (strictly
information preserving) systems are designed, the value of $\epsilon_v$
will have to be externally predetermined (as an additional parameter) to
meet the required smoothing levels. This voltage tolerance parameter can
therefore, when needed, serve as a powerful data compression tool in
picture transmission systems.

From the equations presented above, it is evident that the operation
begins with the minimal step size every time the device leaves the
level state. Such an approach simplifies device implementation without
significantly limiting the acquisition speed.

It must be noted also that the demodulator, as in the CDM case,
is an exact replica of the estimator portion of the transmitting
modulator.
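
The operational equations can be exercised with a short simulation. The sketch below is an illustration under stated assumptions, not the report's hardware: it follows (6.2) through (6.5) and the step rules of Table 10, omits the limiter, starts the estimate at zero, and uses hypothetical names (`classify`, `next_step`, `Estimator`, `encode`, `decode`). The parameter `min_step` stands for the minimum step $2\Delta_0$.

```python
def classify(difference, eps):
    """Tri-state decision: +1 (rise), -1 (fall) or 0 (level, inside the dead zone)."""
    if abs(difference) <= eps:
        return 0
    return 1 if difference > 0 else -1


def next_step(b, b_prev, step, alpha=1.0, beta=0.5, min_step=2.0):
    """Next step from the current and previous states and the current step."""
    if b == 0:                      # level state: hold the estimate
        return 0.0
    if b_prev == 0:                 # leaving the level state: restart at the minimum step
        return min_step * b
    step = (alpha * b + beta * b_prev) * abs(step)
    if abs(step) < min_step:        # never use less than the minimum step
        step = min_step * b
    return step


class Estimator:
    """Shared by modulator and demodulator so that both hold the same estimate."""

    def __init__(self, alpha=1.0, beta=0.5, min_step=2.0):
        self.params = (alpha, beta, min_step)
        self.x = 0.0        # current estimate
        self.step = 0.0     # current step
        self.b_prev = 0     # previous state

    def update(self, b):
        self.step = next_step(b, self.b_prev, self.step, *self.params)
        self.x += self.step
        self.b_prev = b
        return self.x


def encode(samples, eps, **params):
    """Modulator: compare each sample with the estimate and emit the state."""
    estimator, states = Estimator(**params), []
    for sample in samples:
        b = classify(sample - estimator.x, eps)
        states.append(b)
        estimator.update(b)
    return states


def decode(states, **params):
    """Demodulator: replay the states through an identical estimator."""
    estimator = Estimator(**params)
    return [estimator.update(b) for b in states]


samples = [0, 0, 10, 10, 10, 10, 10, 10]
states = encode(samples, eps=1.0)
estimates = decode(states)
```

For this step input the estimate climbs with growing steps and then holds once it falls inside the dead zone, producing a run of zeros rather than the alternating pattern of a two-state modulator.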

6.2.3 Implementation Block Diagram

As stated in the preceding section, the equation which defines the
next incremental step as a function of the current and the previous
state parameters can be written in the form given in (6.5d). The block
diagram which carries out the function of this equation, as well as other
functions of the delta modulator, is shown in Figure 53. Its salient
subunits are described below.

6.2.3.1 State Extractor

The state extractor subunit accepts the quantized signal sample
$S_k$ and subtracts from it the present estimate $X_k$, thus forming the
difference $D_k$. The difference $D_k$ is then applied to a logic unit
which compares it with $\epsilon_v$ (the voltage tolerance value).
Depending on the results of this comparison, the logic unit outputs the
present activity state information vector $B_k$. The $B_k$ indicator is
applied to the encoder for the ultimate transmission to the receiver; it
is also applied to the estimator.

6.2.3.2 The Estimator

The estimator subunit accepts the current state information $B_k$
from the state extractor, delays it until the next sampling period to
provide $B_{k-1}$, and uses both the $B_k$ and $B_{k-1}$ information for
generating the next step size $\Delta_{k+1}$. The estimator subunit also
uses the previous estimate $X_{k-1}$ and the current step $\Delta_k$ for
generating the current estimate $X_k$. The functional subunits comprising
the estimator are described below.

(1) The Limiter. The limiter unit keeps the maximum and minimum
values of the estimate $X_k$ within the bounds set for this signal.

(2) The Registers. The current values of $X_k$, $\Delta_k$, and $B_k$
are stored in their respective registers. The required time delays are
implemented by clocking data in and out of the registers at the
appropriate times.

(3) The Adder. The function of the adder is to generate the new
(i.e., present) value of $X_k$ by summing the current step $\Delta_k$
with the previous estimate $X_{k-1}$.

(4) The Decision Unit. The decision unit is the most
important block of the estimator. It determines the magnitude of the
next step according to the current and past values of $B_k$ and the
current step $\Delta_k$. Table 10 defines the values of $\Delta_{k+1}$ as
functions of $B_k$ and $B_{k-1}$.


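Table 10. Outcomes of the $\Delta_{k+1}$ Decision Unit Versus $B_k$ and $B_{k-1}$

  $y$       $B_k$    $B_{k-1}$    $\Delta_{k+1}$
  $y_0$       0         0         0
  $y_1$       0        -1         0
  $y_2$       0         1         0
  $y_3$      -1         0         $-2\Delta_0$
  $y_4$      -1        -1         $-(\alpha + \beta) \cdot |\Delta_k|$
  $y_5$      -1         1         $-(\alpha - \beta) \cdot |\Delta_k|$
  $y_6$       1         0         $+2\Delta_0$
  $y_7$       1        -1         $(\alpha - \beta) \cdot |\Delta_k|$
  $y_8$       1         1         $(\alpha + \beta) \cdot |\Delta_k|$

Note: $|\Delta_k| \ge 2\Delta_0$.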

From Table 10, it is evident that the decision unit will do the
following:

(a) It will clear the $\Delta_{k+1}$ register when $B_k = 0$.

(b) It will set the $\Delta_{k+1}$ register to $2\Delta_0$ and assign the
sign of $B_k$ to this step when $B_{k-1} = 0$ but $B_k \ne 0$.

(c) For all the remaining possible situations, it will follow the
behavior of an adaptive conventional delta modulator (CDM) defined by
parameters $\alpha$ and $\beta$.

As shown in Table 10, there are nine possible input combinations of
$B_k$ and $B_{k-1}$. Furthermore, out of these nine combinations, the
number that need distinct handling can be reduced, thus simplifying the
implementation of the TSDM. To allow such simplification, the values
$\alpha = 1$ and $\beta = 1/2$ must be used. These values, however, have
gained wide acceptance in CDM and thus are the logical candidates for use
in the TSDM.

When $\alpha = 1$ and $\beta = 1/2$, the ratios between $\Delta_{k+1}$
and $\Delta_k$ can only be either ±1-1/2 or ±1/2, not considering the
cases when $\Delta_{k+1}$ must be equal to $2\Delta_0$. Because dividing
by 2 is equivalent to shifting to the right one time and truncating the
result, multiplication by 1-1/2 is easily obtained by adding $\Delta_k$
to its shifted version. In this manner, the multiplication operation is
reduced to a simple shift operation or to a shift-and-add, with the
proper sign bit taken into consideration.
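
The shift-and-add arithmetic described above can be sketched on integer step magnitudes as follows. The function name and the boolean argument are assumptions for illustration; the sign of the step and the minimum-step rule are handled elsewhere in the modulator and are not shown.

```python
def scaled_step_magnitude(step_magnitude, same_direction):
    """Return the next step magnitude for alpha = 1 and beta = 1/2 using only
    a right shift and an add on a non-negative integer.

    same_direction is True when the current and previous states agree
    (factor 1-1/2) and False when they are opposite (factor 1/2)."""
    shifted = step_magnitude >> 1        # divide by 2, truncating the result
    if same_direction:
        return step_magnitude + shifted  # 1-1/2 times the step
    return shifted                       # 1/2 times the step


grow = scaled_step_magnitude(8, True)
shrink = scaled_step_magnitude(8, False)
```

Note that for odd magnitudes the truncation makes the result slightly smaller than the exact product, for example 5 becomes 7 rather than 7.5.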

Pertinent to such an implementation, as well as to its variations,
is the timing of various events which take place each time the input is
sampled and operated upon. A typical version of such a timing sequence
is presented below.

6.2.4 Timing Sequence for TSDM with $\alpha = 1$ and $\beta = 1/2$

Not considering the overshoot suppression (OSS), the timing sequence
for each operation cycle of the TSDM may typically take up to ten phases,
each phase determined by its timing signal $C_i$. The sequence and the
functions of these timing pulses are as follows:

Timing Pulse $C_1$

    The analog video signal is sampled and quantized.

Timing Pulse $C_2$

    (1) The output of the quantizer is transferred to the $S_k$
        register.

    (2) The estimate of $S_k$ is transferred from the limiting
        network to the $X_k$ register.

Timing Pulse $C_3$

    The estimate $X_k$ is subtracted from $S_k$, resulting in output
    $D_k$.

Timing Pulse $C_4$

    (1) The magnitude of $D_k$ is compared to the contents of the
        $\epsilon_v$ register.

    (2) The two-bit word $B_k$ is generated in terms of $b_0^k$ and
        $b_1^k$.

Timing Pulse $C_5$

    (1) $B_k$ is clocked into a pair of flip-flops.

    (2) $B_{k-1}$ is transferred to a second pair of flip-flops.

    (3) The contents of the $\Delta_k$ register are transferred into
        the "shifted $\Delta_k$" register.

    (4) The trailing edge of $C_5$ applies $B_k$ and $B_{k-1}$ to a
        logic network which determines the $y$ control signal(s)
        according to Table 10.

Timing Pulse $C_6$

    Control signal(s) $y_k$ are "anded" with $C_5$ and the decision to
    clear either one or both ($\Delta_k$ and "shifted $\Delta_k$")
    registers is made according to Table 11, the significance of which
    will be explained later.

Timing Pulse $C_7$

    The outputs of the $\Delta_k$ and "shifted $\Delta_k$" registers are
    added to give the magnitude of $\Delta_{k+1}$.

Timing Pulse $C_8$

    (1) The value of $|\Delta_{k+1}|$ is compared against the stored
        value of $2\Delta_0$, the latter being the minimum step used.

    (2) Control signals which determine whether $|\Delta_{k+1}|$ or
        $2\Delta_0$ is used for the next step are generated.

Timing Pulse $C_9$

    Either $|\Delta_{k+1}|$ or $2\Delta_0$ is passed to the
    adder/subtracter.

Timing Pulse $C_{10}$

    (1) Depending on the state of $b_1^k$, $|\Delta_{k+1}|$ or
        $2\Delta_0$ is either added to or subtracted from the $X_k$
        value.

    (2) The result of the addition (or subtraction) is passed to the
        limiting network, where it is processed and stored to be
        transferred to the $X_k$ register upon the reoccurrence of
        $C_1$.

Timing Pulse $C_{11}$

    If the OSS is implemented, it is processed after Timing Pulse
    $C_{10}$. However, as described in section 6.2.7, additional
    registers and flip-flops are required and their contents are
    operated by Timing Pulse $C_{11}$.

The sequence then repeats with the onset of the next Timing Pulse
$C_1$.

6.2.5 Logic for Clearing $\Delta_k$ and Shifted $\Delta_k$ Registers

The logic for clearing the $\Delta_k$ and shifted $\Delta_k$ registers
is determined by the states of $B_k$ and $B_{k-1}$. This logic is
summarized in Table 11, which also includes some of the information of
Table 10 of section 6.2.3 to indicate how either the values or the
coefficients of the $\Delta_k$ step are determined.

Table 11. Logic for Clearing the $\Delta_k$ and Shifted $\Delta_k$ Registers

                      Required
                      Incremental              Clear $\Delta_k$     Clear Shifted
  $B_k$   $B_{k-1}$   Step                     Register?            Register?
    0        0        0                        Yes (redundant)      Yes (redundant)
    0       -1        0                        Yes                  Yes
    0        1        0                        Yes                  Yes
   -1        0        $-2\Delta_0$             Yes (redundant)      Yes (redundant)
   -1       -1        $-1.5 \cdot |\Delta_k|$  No                   No
   -1        1        $-0.5 \cdot |\Delta_k|$  Yes                  No
    1        0        $+2\Delta_0$             Yes (redundant)      Yes (redundant)
    1       -1        $0.5 \cdot |\Delta_k|$   Yes                  No
    1        1        $1.5 \cdot |\Delta_k|$   No                   No

Most of the required, i.e., nonredundant, decisions for clearing
or not clearing the $\Delta_k$ and shifted $\Delta_k$ registers are
obvious from determining the coefficients of the $|\Delta_k|$ terms in
Table 11. The decisions to clear the registers for the conditions
associated with $B_{k-1} = 0$ are considered redundant because for this
condition the contents of both registers under consideration are zero
anyway. Thus, the decisions to clear both registers for the case
$B_{k-1} = 0$ should be utilized only if such operation simplifies the
logic or is used to preserve the synchronism of the operational cycle.

6.2.6 Parallel Data Handling for High Speed Operation 

In section 6.2.4, the timing sequence for the tri-state delta
modulator was presented and the functions of the various phases of this
sequence were described. The timing sequence presented there was based
on the assumption of serial digital data generation and processing.

Such serial data handling, however, requires excessive clock speeds,
which in turn may unnecessarily complicate the implementation. This
applies particularly to the analog-to-digital (A/D) unit, which typically
consists of a sampler and a serially operated digital quantizer.

Our investigation of parallel data handling techniques and of the 
corresponding state-of-the-art digital devices indicates that parallel 
handling of the arithmetic functions within the estimator loop can permit 
intersample periods as short as 100 nsec and possibly even as short as 
50 nsec. 

Assuming that a 50 nsec intersample period is achievable, two
samples per picture element may be taken, with potentially significant
improvement of the modulator performance during fast and large signal
level changes. Such deliberate "oversampling" may serve as a good
protection against slope overloads, which are the common bane of even
the adaptive delta modulators.

Figure 54 shows the block diagram of the tri-state delta modulator
based on parallel data handling. Note that the salient feature of this
block diagram is the absence of the sampler and quantizer at the input
to the subtractor. Instead, the analog estimate $X_k'$ (the prime
indicates analog form) is subtracted from the analog video signal and the
analog difference $D_k'$ is applied to the comparator and state
extractor. Since all of the functions of the estimator unit are based on
the values of $B_k$ and $B_{k-1}$, the latter being stored in the control
logic unit, the estimator is simply a comparator which has to generate
only a single two-bit word per sample frame. The word generated in this
case supplies the logic with only one of the following three possible
messages:

    (1) $D_k' > \epsilon_v'$ (signal rising)

    (2) $-D_k' > \epsilon_v'$ (signal falling)

    (3) $|D_k'| \le \epsilon_v'$ (signal within dead zone defined by
        $\epsilon_v'$)

The third message for the tri-state delta modulator implies that the
signal is not changing, i.e., a "level" state is achieved.

Thus, one can see that a multi-level, multi-bit signal amplitude
estimator (A/D converter) can be replaced by a far simpler, and faster,
tri-state comparator and state extractor. Such simplification is made
possible by using a digital-to-analog converter (DAC) for transforming
the digitally computed signal estimate into an analog voltage $X_k'$. In
this case, we take advantage of the fact that D/A conversion is
generally simpler to perform than A/D conversion.

The operations within the estimator are performed as shown in the
block diagram of Figure 54, and they generally follow the timing sequence
described in section 6.2.4, with some modifications. These modifications
are the result of changing the digital data handling operations from a
serial to a parallel mode.

6.2.7 Overshoot Suppression (OSS) Implementation

The TSDM described so far, despite its unique features, has all the
characteristics of an adaptive delta modulator. The major advantages of
an adaptive delta modulator over a linear version are: (a) reduced slope
overload and (b) a lower level of granularity noise. The penalty paid for
these advantages, however, is an increased level of overshoots and of the
accompanying oscillations. Consequently, to minimize the effects of the
overshoot and oscillations which accompany large transitions from one
video signal level to another, an overshoot suppression technique should
be considered for the tri-state delta modulator.

In reference [25], which describes techniques for overshoot and
undershoot suppression, the former is detected when the transmitted
$B_k$ sequence is 1, 1, -1, -1 and, correspondingly, the undershoot is
detected when the sequence is -1, -1, 1, 1. Such sequences, however, can
occur following a large change in direction of the video signal. Thus,
the overshoot suppression algorithm may introduce an undesirable
smoothing of the transmitted signal. Furthermore, if the adaptive steps
are small, the above patterns could occur in the normal tracking of the
changing video signal.

To overcome the aforementioned problems, it is proposed that two
extra delays be added and the overshoot and undershoot be detected
according to the following patterns of $B_k$:

    Overshoot:   1, 1, 1, -1, -1, 1

    Undershoot: -1, -1, -1, 1, 1, -1.

Note that the six-character sequences proposed contain in their middle
the four-character sequences described in [25]. But the added bit at the
beginning of the sequence ensures that the step is large enough for an
overshoot to have occurred, and the sixth bit at the end indicates that
the overshoot has definitely occurred and that the signal is not
continuing downward. Similar reasoning applies to the two extreme bits of
the undershoot indicator sequence.
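
The six-symbol detection rule can be sketched with a sliding window over the received state sequence. This is an illustration only: it takes the overshoot pattern to be the mirror image of the undershoot pattern (1, 1, 1, -1, -1, 1), and the names `OVERSHOOT`, `UNDERSHOOT` and `detect_oss` are hypothetical rather than part of the report's logic design.

```python
from collections import deque

OVERSHOOT = (1, 1, 1, -1, -1, 1)
UNDERSHOOT = (-1, -1, -1, 1, 1, -1)


def detect_oss(states):
    """Return (sample index, kind) for every position at which the last six
    states match the overshoot or undershoot pattern."""
    window = deque(maxlen=6)     # plays the role of the six state registers
    events = []
    for k, b in enumerate(states):
        window.append(b)
        pattern = tuple(window)
        if pattern == OVERSHOOT:
            events.append((k, "overshoot"))
        elif pattern == UNDERSHOOT:
            events.append((k, "undershoot"))
    return events


events = detect_oss([0, 1, 1, 1, -1, -1, 1, 0])
```

A bare four-symbol sequence such as 1, 1, -1, -1 followed by further -1 values does not trigger the detector, which is the intended protection against a genuine change of signal direction.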

Figure 55 shows the proposed implementation for the overshoot
suppression (OSS). The implementation shown applies to the receiver,
i.e., the demodulator portion of the digital TV link. As shown in the
figure, the incoming serial input is converted into parallel outputs of
$b_0$ and $b_1$. For each kth sample clocked out of the
serial-to-parallel (S/P) converter at the frame clock rate $C_F$, the
$b_0^k$ and $b_1^k$ bits are deposited in their respective registers to
store the corresponding value of $B_k$. Thus, a continuously updated
sequence of six values of $B_k$, i.e., $B_k$ through $B_{k-5}$, is
supplied to the OSS Detect Logic Unit.

As in the transmitting modulator, the estimator portion of the
receiver shown in Figure 55 determines the values of $X_k$ and
$\Delta_k$ based on the information available from state vectors $B_k$
and $B_{k-1}$. However, the signal applied to the digital-to-analog
converter is delayed by two sample periods. In other words, the analog
video output is derived from $X_{k-2}$ instead of $X_k$.



Figure 55. Overshoot Suppression (OSS) Implementation for Demodulator 

Because the values of $B_k$ and $B_{k-1}$ are available to the
estimator at the onset of the $X_{k+1}$ formation cycle, a typical cycle,
not involving OSS, is as follows:

    Step 1: Form $\Delta_{k+1}$ according to

            $\Delta_{k+1} = \alpha |\Delta_k| \cdot B_k + \beta B_{k-1} \cdot |\Delta_k|$

            where $\Delta_k$ = present step

                  $\alpha, \beta$ = weighting factors

                  $B_k$ = present activity state information
                          (0, 1, or -1)

                  $B_{k-1}$ = previous sample activity state information
                          (0, 1, or -1).

    Step 2: Form $X_{k+1}$ as

            $X_{k+1} = X_k + \Delta_{k+1}$.

    Step 3: Shift registers to the right, setting $X_{k-2} = X_{k-1}$,
            $X_{k-1} = X_k$, and $X_k = X_{k+1}$. Return to Step 1.

Now, if an OSS condition is present, the detection logic puts out an
extra clock pulse.* Because this extra pulse precedes the next "normal"
frame clock pulse $C_F$,** the following operation takes place after
Step 2:

Step 2a: Set Xj^ to Xj ^_2 ^k_r ^k"°* 

This extra step is then followed by Step 3 as in normal operation. The
results are different, however, because now we have set $B_k = 0$ in
Step 2a, and this will prevent the OSS pattern from occurring for at
least six sampling time intervals. Furthermore, setting $B_k = 0$ will
also cause the size of the next step, if required, to be the minimal size
of $2\Delta_0$ as shown in Table 11. Such selection takes place when
$B_k = 0$ is shifted by $C_F$ to become $B_{k-1} = 0$ for the subsequent
frame.


*Similar to $C_{11}$ of section 6.2.4.

**Similar to $C_1$ of section 6.2.4.

7.0 ASSESSMENT OF PREVIOUSLY CONTRACTED EFFORTS

7.1 NASA/Ames Three-Dimensional Hadamard Transform Coding System

A three-dimensional coding system that takes a transformation
in the temporal direction, as well as transformations in the spatial
domain, exploits the spatial correlation within each frame as well as the
frame-to-frame (temporal) correlation, thus achieving a better coding
performance than the two-dimensional transform coding methods. Although
this is possible using any one of the unitary transformations, the
complexity of the system is reduced by using a transformation that
requires minimal hardware for its implementation, such as the Hadamard
transform.

A three-dimensional Hadamard coder, which employs 2x2 pixel blocks
by 2 frames in time, is currently under investigation at Bell Telephone
Laboratories. This system exploits the correlation of the data in both
spatial and temporal directions, requires only one frame of memory, and
is amenable to simple hardware implementation. However, this coding
technique is not highly efficient, due to the rather small number of
pixels (a total of seven) used to exploit correlations in the data.
Hardware for this encoder is in the development stage at Bell Telephone
Laboratories [26].

NASA/Ames Research Center at Moffett Field, California, has developed 
a three-dimensional Hadamard transform coder that uses a block size of 
4x4 pixels and 4 frames in time. This coder is designed to reduce the 
bandwidth of high resolution satellite television signals transmitted at 
30 fps. It employs a 4x4 pixel sliding window technique to eliminate visual 
edge effects due to the Hadamard encoder. 

The encoder is designed for standard U.S. commercial television
signals; thus, it uses 525 lines per frame. However, each line is sampled
to generate 512 samples per line. This corresponds to about 8 megasamples
per second for a 4 MHz analog bandwidth of the original television
signal. A functional block diagram of this encoder is shown in
Figure 56. The Hadamard transformation is applied to sub-blocks of 4x4x4
samples. The bandwidth reduction is obtained by assigning more bits to
lower coefficients and fewer bits to higher coefficients. The system is
capable of operating from off-air television signals, video tapes, or
television cameras. A block diagram showing the field demultiplexing
essential to perform the three-dimensional Hadamard transform of
interlaced signals is shown in Figure 57. The block diagram of the
decoder is shown in Figure 58. The system can be programmed to operate at
various bit rates. It also has an adaptive mode of operation that selects
a combination of vectors and cut points best fitted to the data.
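
To make the 4x4x4 transform concrete, the following is a minimal sketch of a separable three-dimensional Hadamard transform applied along the pel, line, and frame axes in turn. It is an illustration under stated assumptions, not the NASA/Ames hardware: the matrix is in natural (Sylvester) order rather than any particular coefficient ordering used by the coder, the transform is left unnormalized, and the names `H4`, `hadamard_3d` and `inverse_hadamard_3d` are hypothetical.

```python
# 4x4 Hadamard matrix in natural (Sylvester) order; it is symmetric and H4 * H4 = 4 * I.
H4 = [
    [1,  1,  1,  1],
    [1, -1,  1, -1],
    [1,  1, -1, -1],
    [1, -1, -1,  1],
]


def hadamard_3d(block):
    """Unnormalized 3-D Hadamard transform of block[t][y][x] (4 frames x 4 lines x 4 pels)."""
    n = 4
    # Transform along each line (pel axis).
    a = [[[sum(H4[u][x] * block[t][y][x] for x in range(n)) for u in range(n)]
          for y in range(n)] for t in range(n)]
    # Transform across lines (vertical axis).
    b = [[[sum(H4[v][y] * a[t][y][u] for y in range(n)) for u in range(n)]
          for v in range(n)] for t in range(n)]
    # Transform across frames (temporal axis).
    return [[[sum(H4[w][t] * b[t][v][u] for t in range(n)) for u in range(n)]
             for v in range(n)] for w in range(n)]


def inverse_hadamard_3d(coefficients):
    """Inverse transform: apply the forward transform again and divide by 4**3 = 64."""
    twice = hadamard_3d(coefficients)
    return [[[value // 64 for value in row] for row in plane] for plane in twice]


# A flat (constant) sub-block puts all of its energy into one coefficient.
flat = [[[5] * 4 for _ in range(4)] for _ in range(4)]
flat_coefficients = hadamard_3d(flat)
```

A block with little spatial or temporal variation therefore concentrates its energy in a few coefficients, which is what allows the remaining coefficients to be coded with few bits or dropped.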

The shortcoming of this system is the large storage requirement
which is needed to store 4 frames of data. The system can be modified to
encode color television by operating on the illuminance and the
chromaticity components individually. This will require storing as many
as 12 frames of data at the transmitter, though it is conceivable that
this number could be reduced because the chromaticity signals require a
much smaller bandwidth. The performance of the existing encoder operating
on the frame-sequential color video signal is anticipated to be rather
poor, due to the smaller correlation between the spectral bands as
compared to the correlation of sequential frames in a monochrome video
signal.

7.2 LINKABIT Three-Dimensional Hybrid Encoder 

Three-dimensional transform coding systems suffer from the
shortcoming of the excessive storage which is needed to store previous
frames. A three-dimensional hybrid encoder that uses a two-dimensional
transformation in the spatial domain cascaded with a DPCM encoder in the
temporal domain will require storing only one frame of data and should
perform better than the corresponding three-dimensional encoders.

Before discussing the LINKABIT approach in detail, it is worth
noting that an investigation of the hybrid encoder using a two-dimensional
cosine transform cascaded with a DPCM encoder for reducing the bandwidth
of RPV imagery is underway at the Naval Undersea Center (NUC) in San
Diego, California [27].

This hybrid encoder exploits spatial correlation of a television
image by taking a two-dimensional discrete cosine transform and exploits
temporal correlation of the data by using a DPCM encoder. It is
anticipated that this system will reduce the number of binary digits
needed for reconstruction of television at the receiver by a factor of
about 5 over the two-dimensional hybrid encoder. This system can be
modified to encode the illuminance and the chromaticity components of a
color video signal. This will require storing three frames of data at the
transmitter. The performance of this system operating on the
frame-sequential color video data is expected to be inferior to its
performance for monochrome television signals.

[Block diagram, ENCODE SYSTEM: TV camera; A/D converter (6 bits); field
demultiplexer, which sequentially outputs sets of 8 fields; de-interlace
and data ordering unit, with memory to save 8 fields of information and
outputting 8 fields simultaneously (64 pels in subpicture); Hadamard
transform unit (pel-to-line, line-to-area, and area-to-volume
transforms) producing 64 Hadamard vectors; vector selector, which deletes
vectors or represents them with 2, 3, or 4 bits; compressed data.]

Figure 57. Field Demultiplexer

[Block diagram, DECODE SYSTEM]

The LINKABIT scheme is significantly simpler than the NUC design.
In the LINKABIT case, a two-dimensional 4x4 Hadamard transform is
computed per frame and only the difference of transform coefficients from
frame to frame is quantized. This technique has many of the advantages of
other three-dimensional algorithms, but considerably less memory is
required and there is no rate buffering due to variable scene activity.

Conceptually, this scheme can be thought of as transmitting the
compressed two-dimensional Hadamard transform coefficients for the first
frame. The two-dimensional quantization is performed according to [28].
Figure 59 shows the 16 Hadamard subpictures (basis vectors). The numbers
in parentheses below each subpicture are the number of bits of
quantization allocated to the corresponding coefficients. The total
number of bits for 16 pels (the sum of the numbers in parentheses) is 32,
or 2 bits per pel.

For three frames after the first (reference) frame, only quantized
differences between the compressed transform coefficients of a subpicture
and the corresponding subpicture in the past frame are transmitted.
Actually, only differences on three of the 16 coefficients are sent, for
a total of 11, 11, and 10 bits sent per 16-pel array in the second,
third, and fourth frames, respectively. Together with the 32 bits of the
reference frame, this results in an average of one bit per pel over the
four-frame cycle. This procedure is repeated by again transmitting a
compressed two-dimensional frame independent of previous transmissions.
The additional compression over that obtained with a two-dimensional
Hadamard transform scheme is possible because the difference coefficients
can be quantized much more coarsely than the two-dimensional transform
coefficients themselves. The smaller memory is possible because it is
sufficient to store a compressed, rather than an uncompressed, form of
the reference frame.
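
The bit budget quoted above can be checked with a few lines of arithmetic. The helper name `average_bits_per_pel` is an assumption for illustration; the figures are those given in the text (32 bits for the reference frame, then 11, 11, and 10 bits per 16-pel array).

```python
def average_bits_per_pel(bits_per_array_by_frame, pels_per_array=16):
    """Average bits per pel over a cycle of frames, given the bits spent
    on one 4x4 (16-pel) array in each frame of the cycle."""
    total_bits = sum(bits_per_array_by_frame)
    total_pels = pels_per_array * len(bits_per_array_by_frame)
    return total_bits / total_pels


four_frame_cycle = [32, 11, 11, 10]   # reference frame, then three difference frames
average = average_bits_per_pel(four_frame_cycle)
reference_only = average_bits_per_pel([32])
```

The four-frame cycle spends 64 bits on 64 pels, i.e., half the 2 bits per pel of the purely two-dimensional scheme.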

The encoder accepts as input a standard composite NTSC
black-and-white video signal. The outputs of the encoder are a compressed
8.064 Mbps data stream and its associated clock. The received 8.064 Mbps
compressed stream and its recovered clock form the input to the decoder.
The decoder output is a reconstructed NTSC composite video signal. Two
audio channels, sampled and 8-bit quantized at the video line rate
(15.75 kHz), are multiplexed into the compressed video data stream during
the horizontal sync pulse time. The encoder also inserts a sync word in
the output data stream every four frames. Circuitry is provided in the
decoder to acquire and track this sync word.

Figures 60 and 61 are block diagrams of the video compression 
encoder and decoder, respectively. 

Operation of the encoder proceeds as follows: The composite video
input signal is amplified, DC restored, and 8-bit A/D converted. The
4-line buffer stores quantized samples on four video lines and serially
provides the Hadamard transformer with sets of 16 samples within 4x4
square arrays. The transform coefficients are further quantized with
the number of bits per coefficient given in Figure 59. The quantization
is nonlinear, following [28].

To minimize the data rate smoothing required, quantized Hadamard
coefficients are sent for one-fourth of the 4x4 pel arrays in each frame
(these are called refresh arrays). Quantized coefficient differences are
sent for the remaining three-fourths of the 4x4 arrays. For refresh 4x4
arrays, the switches are in the positions shown in Figure 60. The
quantized coefficients are stored in the Main (frame) Memory and are sent
through the Rate Buffer to form the output data. For differenced 4x4
arrays, both switches in Figure 60 change positions. The Quantizer
outputs are subtracted from the corresponding refresh array coefficients
at the Main Memory output. These coefficient differences are further
quantized and forwarded to the Rate Buffer.
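
The refresh and difference paths can be sketched as follows. This is only
an illustration under stated assumptions: the rotating schedule in
is_refresh (each array refreshed once every four frames) is a guess at how
"one-fourth of the arrays in each frame" is distributed, the second
quantization of the differences is omitted, and all function names are
hypothetical.

```python
import numpy as np


def is_refresh(array_index, frame_no):
    # Assumed schedule: every array is refreshed once in four frames.
    return array_index % 4 == frame_no % 4


def encode_frame(coeffs, memory, frame_no):
    """coeffs, memory: integer arrays of shape (n_arrays, 16).

    memory plays the role of the Main (frame) Memory and is updated in place.
    """
    sent = np.empty_like(coeffs)
    for k in range(len(coeffs)):
        if is_refresh(k, frame_no):
            memory[k] = coeffs[k]            # store the refresh coefficients
            sent[k] = coeffs[k]              # send the coefficients themselves
        else:
            sent[k] = memory[k] - coeffs[k]  # send refresh minus current
    return sent


def decode_frame(sent, memory, frame_no):
    """Inverse of encode_frame, using the decoder's own copy of the memory."""
    out = np.empty_like(sent)
    for k in range(len(sent)):
        if is_refresh(k, frame_no):
            memory[k] = sent[k]
            out[k] = sent[k]
        else:
            out[k] = memory[k] - sent[k]
    return out


rng = np.random.default_rng(0)
enc_mem = np.zeros((8, 16), dtype=np.int64)
dec_mem = np.zeros((8, 16), dtype=np.int64)
round_trip_ok = True
for frame_no in range(6):
    coeffs = rng.integers(-128, 128, size=(8, 16))
    sent = encode_frame(coeffs, enc_mem, frame_no)
    decoded = decode_frame(sent, dec_mem, frame_no)
    round_trip_ok = round_trip_ok and np.array_equal(decoded, coeffs)
```

Because the sketch leaves out the quantization of the differences, the
round trip is exact; in the real system the differences are quantized
again and the reconstruction is only approximate.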

All timing signals and control signals are derived by phase locking 
to the horizontal and vertical sync stripped from the composite input 
signal. 

The decoder of Figure 61 performs essentially the inverse functions
to those of the encoder. Inverting the encoder quantizing operations
involves assigning "representative values" to quantization intervals.
The decoder also acquires synchronization and regenerates composite sync,
which is added to the reconstructed video signal.
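
The idea of assigning representative values to quantization intervals can
be sketched as below. The decision levels in EDGES and the use of interval
midpoints as representative values are assumptions chosen for
illustration; the report takes its nonlinear quantizer from [28] and does
not list the levels here.

```python
import numpy as np

# Hypothetical nonuniform decision levels (finer near zero).
EDGES = np.array([-128.0, -32.0, -8.0, 0.0, 8.0, 32.0, 128.0])
# Representative value of each interval; midpoints are an assumption.
REPRESENTATIVES = (EDGES[:-1] + EDGES[1:]) / 2.0


def quantize(x):
    """Encoder side: map each value to the index of its interval."""
    idx = np.searchsorted(EDGES, x, side="right") - 1
    return np.clip(idx, 0, len(REPRESENTATIVES) - 1)


def dequantize(idx):
    """Decoder side: replace each index by its representative value."""
    return REPRESENTATIVES[idx]


samples = np.array([-40.0, -3.0, 5.0, 100.0])
indices = quantize(samples)
reconstructed = dequantize(indices)
```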

Figure 60. Video Compression Encoder

Figure 61. Video Compression Decoder

7.3 TRW 2D-DPCM for Color Data Compression

TRW analyzed the most economical approach to provide data compression
of field-sequential color and NTSC color. In both cases, the most
economical technique was 2D-DPCM.

Analysis of the field-sequential data, along with the weight and
power sizing of the encoder, led to an approach that uses the green (G)
field in place of the illuminance and generates the chromaticity
signals by subtracting G from the red (R) and blue (B) fields.
This approach is fruitful if the chrominance signals contain less energy
and possess a smaller bandwidth than the original fields. The
chrominance signals can then be subsampled by taking every other sample
and encoded using a coarser quantization, resulting in a further
bandwidth compression. Table 12 shows the statistics of the G and the two
chromaticity signals for three typical fields of the field-sequential
data. Figure 62 shows the power spectra of the green and chromaticity
signals. These results indicate that the chrominance signals indeed
possess less energy and a lower bandwidth than the original fields.

Table 12. Statistics of the Field-Sequential G, R-G, and B-G Fields

Fields    Minimum    Maximum    Average    Standard Deviation
G             0.0        255     117.05                 51.77
R-G          -101         60       8.03                 12.48
B-G           -88         77       1.80                  9.30

Figure 62. Relative Power Spectra of G, R-G, and B-G Fields

The above processing of the field-sequential data results in some
reduction in its bandwidth, owing to the subsampling of the R-G and
B-G fields. A 2-to-1 subsampling of R-G and B-G results in a total
bandwidth compression ratio of 1.5 to 1. To achieve additional bandwidth
compression, the G, R-G, and B-G fields must be encoded. The performance
of the candidate techniques considered by TRW [18] is shown in
Figure 63. The two adaptive DPCM systems have essentially identical
performance. The performance of the hybrid encoder, on the other hand,
is almost the same as that of the nonadaptive DPCM encoder. The
difference in performance between the adaptive and nonadaptive DPCM
systems is fairly small at 3 bits per sample. At 2 bits per sample, the
difference is about 3 dB in signal-to-noise ratio and may be significant
for some applications. On the other hand, the complexity of the adaptive
DPCM encoder is much greater than that of the nonadaptive DPCM
encoder. For this reason, and because the lighting will be well
controlled in the Shuttle, the nonadaptive DPCM encoder is selected over
the adaptive DPCM system. The hybrid encoder was rejected because it
requires more than twice the parts count of the nonadaptive DPCM encoder.
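
The 1.5-to-1 ratio quoted above for 2-to-1 subsampling of R-G and B-G
follows from a simple sample count. As a sketch, let N denote the number
of samples in one full-resolution field (N is a symbol introduced here
for illustration, not one used in the report):

```latex
\frac{\text{samples before subsampling}}{\text{samples after subsampling}}
  = \frac{N + N + N}{N + \tfrac{N}{2} + \tfrac{N}{2}}
  = \frac{3N}{2N}
  = 1.5
```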

A block diagram of the proposed bandwidth compression technique
for the field-sequential color TV is shown in Figure 64. The system
starts its operation upon receipt of the green field, which is encoded
using a 2D-DPCM loop and transmitted using 3 bits per sample. The
green field is also filtered by a 3-point Hanning filter, subsampled, and
stored in the field memory. This requires storing only 256 samples per
line. Next, the R field is filtered, subsampled, and combined with the
G field to generate R-G for each line. R-G is encoded using a DPCM loop
with a 4-level quantizer. Finally, the same procedure is used to generate
and encode the B-G field. Since G is encoded in full resolution at 3 bits
per sample, but R-G and B-G are encoded at 2 bits per sample with reduced
resolution, rate buffering at the output of the encoder is required.
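
A 2D-DPCM loop of the general kind referred to above can be sketched as
follows. The predictor (average of the reconstructed pel to the left and
the pel above), the eight reconstruction levels in LEVELS_3BIT, and the
start value of 128 are all assumptions made for this illustration; the
report does not give TRW's predictor coefficients or quantizer levels.

```python
import numpy as np

# Hypothetical 8-level (3-bit) reconstruction levels for the prediction error.
LEVELS_3BIT = np.array([-40.0, -20.0, -9.0, -3.0, 3.0, 9.0, 20.0, 40.0])


def predict(recon, y, x, start=128.0):
    """Predict a pel from already-reconstructed neighbours (assumed form)."""
    if y == 0 and x == 0:
        return start
    if y == 0:
        return recon[y, x - 1]           # first line: previous pel only
    if x == 0:
        return recon[y - 1, x]           # first pel of a line: pel above only
    return 0.5 * (recon[y, x - 1] + recon[y - 1, x])


def dpcm_2d_encode(image, levels=LEVELS_3BIT):
    """Return the level indices and the encoder's own reconstruction."""
    h, w = image.shape
    recon = np.zeros((h, w))
    codes = np.zeros((h, w), dtype=np.uint8)
    for y in range(h):
        for x in range(w):
            pred = predict(recon, y, x)
            # Pick the level closest to the prediction error.
            k = int(np.argmin(np.abs(levels - (image[y, x] - pred))))
            codes[y, x] = k
            recon[y, x] = np.clip(pred + levels[k], 0.0, 255.0)
    return codes, recon


def dpcm_2d_decode(codes, levels=LEVELS_3BIT):
    """Rebuild the picture by running the same predictor on received codes."""
    h, w = codes.shape
    recon = np.zeros((h, w))
    for y in range(h):
        for x in range(w):
            pred = predict(recon, y, x)
            recon[y, x] = np.clip(pred + levels[codes[y, x]], 0.0, 255.0)
    return recon


yy, xx = np.mgrid[0:8, 0:8]
image = (100 + 6 * xx + 4 * yy).astype(float)   # hypothetical test ramp
codes, encoder_recon = dpcm_2d_encode(image)
decoded = dpcm_2d_decode(codes)
```

The encoder predicts from its own reconstruction rather than from the
original pels, so the decoder, which sees only the codes, stays in step
with it. A 4-level table in place of LEVELS_3BIT would give the 2 bits
per sample used for R-G and B-G.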



Figure 64. Block Diagram of Proposed Bandwidth Compression Technique for
Field-Sequential Color TV. The switches are in the "A" position when G
is at the input.

Figure 65 illustrates the operation of the precompression green
field memory. Input video data from the A/D converter (8 MHz sampling
rate) is applied to the Green Field Select Logic. Green field data is
passed via the output select logic to the video compressor circuitry.
The green field, as well as the red and blue fields, is also subsampled
by taking every other sample in the subsampling logic. As a green field
emerges from the subsampling logic, it is loaded into the serial green
field memory by means of the recirculate switch. This switch is activated
by the recirculate logic; thus, data is returned to the memory during the
next two successive fields (red and blue). This ensures that spatially
related video samples differing only in successive field numbers are
subtracted from each other, and this difference is fed to the DPCM
compressor circuitry. During the initial memory load time, the subtracter
output is disabled, and the green field data is fed to the compression
logic. Because of the serial nature of the data, 16K CCD (charge-coupled
device) shift registers have been chosen for implementation of the serial
green field memory.
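
The filter, subsample, and subtract path can be sketched for a single
line as follows. The weights (1/4, 1/2, 1/4) are the usual reading of a
3-point Hanning filter and are an assumption here, as are the edge
handling, the 512-sample test line, and the function names.

```python
import numpy as np

# Assumed 3-point Hanning weights; they sum to 1.
HANNING_3 = np.array([0.25, 0.5, 0.25])


def filter_and_subsample(line):
    """Smooth a line with the 3-point filter, then keep every other sample."""
    padded = np.pad(np.asarray(line, dtype=float), 1, mode="edge")
    smoothed = np.convolve(padded, HANNING_3, mode="valid")
    return smoothed[::2]


def colour_difference(colour_line, stored_green):
    """Subtract the stored green samples from a filtered, subsampled R or B line."""
    return filter_and_subsample(colour_line) - stored_green


# Hypothetical 512-sample lines: a green ramp and a red line offset by 8.
green = np.linspace(50.0, 200.0, 512)
red = green + 8.0

stored_green = filter_and_subsample(green)      # contents of the green memory
r_minus_g = colour_difference(red, stored_green)
```

With 512 samples per line the stored green line holds 256 samples, which
matches the storage figure given earlier for the field memory.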

The proposed Field-Sequential Color TV compression system requires
a post-compression rate buffer memory. This requirement arises from the
unequal sampling rates of the green field and the R-G and B-G fields. In
addition, the green field is encoded at 3 bits/sample while the R-G and
B-G fields are encoded at 2 bits/sample. Therefore, the output rate
changes from 23.6 Mbps for the green field to 7.8 Mbps for the remaining
two fields. To maintain a constant output rate, a buffer memory is
required to smooth the output rate to 13.1 Mbps.* Figure 65 shows the
functional block diagram of the proposed rate buffer memory mechanization.
Because of the differing input and output data rates involved, a random
access memory (RAM) has been chosen. The output data from the video
compressor is input to the demultiplexer for double buffering into the
RAM. Double buffering is required so that no interruption of the input
data will occur while memory loading takes place. Thus, each sample is
shifted into the buffer register (a sample at a time) and loaded
(rewritten) into the memory 12 bits at a time. To prevent data loss, two
buffer registers (double buffering) are required, one holding data for
load, the other accumulating data. This technique slows the input and
output memory data rates to where read/write collision* can be avoided by
simple priority logic. The double buffer slows the memory input rate to a
point where, in the event of a collision, there is sufficient time to
allow a data read before the data write. This anti-collision control is
provided by the Read/Write combiner and anti-collision logic. Data read
from the memory is stored in the output data buffer register and shifted
to the transmission link. Read, Write, and Refresh address control are
provided by the appropriate counters.

* The actual rate will be slightly higher due to inclusion of
synchronizing signals.

Figure 65.
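
The 13.1 Mbps smoothed rate quoted above is consistent with averaging
the rates of the three fields in one color cycle. The sketch below checks
that arithmetic and estimates how far the buffer fills during the green
field. The field duration of 1/60 second is an assumption (the text does
not state it), and synchronizing overhead is ignored.

```python
GREEN_RATE_MBPS = 23.6      # compressor output rate during the green field
DIFF_RATE_MBPS = 7.8        # compressor output rate during R-G and B-G fields
FIELD_TIME_S = 1.0 / 60.0   # assumed field duration; not stated in the text

# One color cycle is one green field followed by two difference fields.
average_rate = (GREEN_RATE_MBPS + 2 * DIFF_RATE_MBPS) / 3.0

# Megabits accumulated while the input runs faster than the output...
fill_mbit = (GREEN_RATE_MBPS - average_rate) * FIELD_TIME_S
# ...and megabits drained over the two slower fields that follow.
drain_mbit = (average_rate - DIFF_RATE_MBPS) * 2 * FIELD_TIME_S

print(f"average rate: {average_rate:.1f} Mbps")
print(f"fill during green field: {fill_mbit:.3f} Mbit")
print(f"drain during R-G and B-G fields: {drain_mbit:.3f} Mbit")
```

The fill and drain amounts balance, so under these assumptions a buffer
of a little under 0.18 Mbit would absorb the rate variation.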

For the standard NTSC color TV system, TRW proposes a modification
so that the analog illuminance (Y) and chromaticity signals (I, Q) are
available in an unmodulated form. In the absence of such a modification,
a comb filter is required to demodulate these signals. The proposed
bandwidth compression technique for the NTSC Color TV uses a 2D-DPCM
loop.

In the analog transmission of I and Q signals, they are low-pass
filtered and multiplexed with the illuminance signals as shown in Figure
66. This technique is practical, since human vision is very insensitive
to high-frequency components of the I and Q signals. Taking advantage of
this property, TRW also proposes low-pass filtering of the I and Q
signals. The passbands of these filters are about one-fifth that of the
illuminance signal. Maintaining a spatial resolution of 512 samples per
line for the illuminance gives a spatial resolution of about 100 samples
per line for the I and Q signals. A further bandwidth compression can be
achieved by alternating the transmission of the I and Q signals with each
line of the illuminance signal. The receiver then restores the missing
color component for each line by interpolating between the transmitted
components for the previous and the following lines. The performance of
such a system was evaluated at Bell Telephone Laboratories for the Color
Picturephone [29]. There was no color degradation as a result of
alternate transmission of the chromaticity signals.
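
The receiver-side restoration of the missing color component can be
sketched as a simple average of the nearest transmitted lines above and
below. Averaging is an assumed form of the interpolation; the text says
only that the receiver interpolates between the previous and following
lines. The function name and the test data are hypothetical.

```python
import numpy as np


def restore_missing_lines(chroma, sent):
    """Fill lines that were not transmitted from their transmitted neighbours.

    chroma: array of shape (n_lines, n_samples); rows where sent is False
            hold no valid data.
    sent:   boolean array of length n_lines.
    """
    chroma = np.asarray(chroma, dtype=float)
    out = chroma.copy()
    n = len(chroma)
    for k in range(n):
        if sent[k]:
            continue
        # Use the previous and next lines where they exist and were sent.
        neighbours = [chroma[j] for j in (k - 1, k + 1)
                      if 0 <= j < n and sent[j]]
        out[k] = np.mean(neighbours, axis=0)
    return out


# Hypothetical I signal that changes linearly from line to line.
n_lines, n_samples = 5, 4
true_i = np.outer(np.arange(n_lines) * 10.0, np.ones(n_samples))
sent_i = np.arange(n_lines) % 2 == 0             # I sent on even lines only
received_i = np.where(sent_i[:, None], true_i, 0.0)
restored_i = restore_missing_lines(received_i, sent_i)
```

For a signal that varies linearly between lines, as in this test data,
the averaged lines reproduce the original exactly; the Q signal would be
handled the same way with the odd lines marked as sent.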

A block diagram of the proposed encoder is shown in Figure 67.
The illuminance signal is sampled at a rate of 7.8 megasamples per second
and is encoded by a 2D-DPCM system at 3 bits per sample. The sampling and
transmission of the illuminance signal take place during the active
period of the line scan. During the period of blanking, flyback, and the
interval which is normally used for analog transmission of the modulated
color signal, TRW proposes to transmit either I or Q signals. The
corresponding time intervals for commercial NTSC Color TV are shown in
Figure 68. This arrangement gives sufficient time for transmission of 100
chrominance samples in the nonactive interval. Both the illuminance and
the chromaticity components can use the same 2D-DPCM encoder if
additional memories are provided to store 100 samples of the I signal and
100 samples of the Q signal for use in the DPCM predictor. This
additional memory and the memory required to delay I or Q for the active
duration of a line scan are the only components that need to be added to
the 2D-DPCM encoder.

* Read/write collision may be described as an attempt to write into
the memory at one location (address) simultaneously with an attempt to
read from the memory at another location.

Figure 66. NTSC Color Composite Signal Waveform


The performance of this system would be very similar to the
performance of the proposed system for Field-Sequential Color TV, since
both use the same 2D-DPCM encoder for the bandwidth compression of the
illuminance signal. The bandwidth of this system, however, is higher.
Maintaining the same spatial resolution as that of the Field-Sequential
Color TV signal requires 28 Mbps.



Figure 68. Active and Nonactive Portion of the Scan Line

8.0 SUMMARY AND CONCLUSIONS

The techniques for data compression of digital television have been
presented in this report. Starting from the basic PCM techniques of
digitizing a signal, Section 2.0 presented entropy-preserving coding such
as Shannon-Fano and Huffman coding, as well as polynomial data
compression techniques such as the zero-order predictor, the zero-order
interpolator, the first-order predictor, and the first-order
interpolator. Section 2.0 also presented predictive coding techniques
such as DPCM, adaptive DPCM, delta modulation (DM), and adaptive delta
modulation (ADM). An important form of ADM for digital television
compression is tri-state ADM, whose implementation was discussed in
detail in Section 6.0. Transform techniques were presented in Section
3.0. Section 4.0 combined the results of Sections 2.0 and 3.0 to present
two- and three-dimensional techniques for digital television compression.
Thus, the data compression techniques for black-and-white television were
presented in Sections 2.0 through 4.0. Section 5.0 extended these
techniques to the NTSC color television system, as well as the
Field-Sequential color television used for spaceborne applications. A
number of the techniques have been the subject of contracted efforts to
develop a feasible digital television system. Section 7.0 assessed the
practicality of each of the systems under development.

There are a number of conclusions that can be drawn from this study
of digital television compression techniques. For black-and-white
television at moderate bit rates (15-30 Mbps), where complexity is at a
premium, the best choices are two-dimensional techniques.
Three-dimensional techniques require frame storage, which is a
significant increase in complexity. The best choice for moderate bit
rates at present is probably two-dimensional DPCM as developed by TRW
[18]. Two-dimensional delta modulation and tri-state adaptive delta
modulation are very promising from preliminary results, but these
techniques have not been fully proven. For very low bit rates (i.e., less
than 10 Mbps), three-dimensional techniques are required. Transforms will
probably be required, at least for the spatial coordinates. The best
results to date have been obtained by NASA/Ames using a three-dimensional
Walsh-Hadamard transform. However, the hardware complexity of the Ames
system is formidable. The Linkabit two-dimensional Walsh-Hadamard
transform with frame differencing is probably the next best choice for
low bit rates.

The best techniques for frame-sequential color are identical to
the black-and-white schemes for moderate bit rates and small complexity.
For low bit rates, where three-dimensional techniques are required, the
frame-sequential color will probably have to be converted to standard
NTSC color. For NTSC color, the best choice to date seems to be the TRW
system where the analog illuminance (Y) and chromaticity signals (I, Q)
are available in an unmodulated form. In the absence of such a
modification, a comb filter is required to demodulate these signals. The
proposed bandwidth compression technique for the NTSC Color TV uses a
2D-DPCM loop.